YIELD-RELATED POLYNUCLEOTIDES AND POLYPEPTIDES IN PLANTS 

This application claims the benefit of US Provisional Application No. 
60/310,847, filed August 9, 2001, US Provisional Application No. 60/336,049, filed 
December 5, 2001, US Provisional Application No. 60/338,692, filed December 11, 
2001, and US Non-provisional Application No. 10/171,468, filed June 14, 2002, the 
entire contents of which are hereby incorporated by reference. 

FIELD OF THE INVENTION 

This invention relates to the field of plant biology. More particularly, the 
present invention pertains to compositions and methods for phenotypically modifying 
a plant. 

INTRODUCTION 

A plant's traits, such as its biochemical, developmental, or phenotypic 
characteristics, may be controlled through a number of cellular processes. One 
important way to manipulate that control is through transcription factors - proteins 
that influence the expression of a particular gene or sets of genes. Transformed and 
transgenic plants that comprise cells having altered levels of at least one selected 
transcription factor, for example, possess advantageous or desirable traits. Strategies 
for manipulating traits by altering a plant cell's transcription factor content can 
therefore result in plants and crops with commercially valuable properties. Applicants 
have identified polynucleotides encoding transcription factors, developed numerous 
transgenic plants using these polynucleotides, and have analyzed the plants for a 
variety of important traits. In so doing, applicants have identified important 
polynucleotide and polypeptide sequences for producing commercially valuable 
plants and crops as well as the methods for making them and using them. Other 
aspects and embodiments of the invention are described below and can be derived 
from the teachings of this disclosure as a whole. 

BACKGROUND OF THE INVENTION 

Transcription factors (TFs) can modulate gene expression, either increasing or 
decreasing (inducing or repressing) the rate of transcription. This modulation results 
in differential levels of gene expression at various developmental stages, in different 
tissues and cell types, and in response to different exogenous (e.g., environmental) 
and endogenous stimuli throughout the life cycle of the organism. 

Because transcription factors are key controlling elements of biological 
pathways, altering the expression levels of one or more transcription factors can 
change entire biological pathways in an organism. For example, manipulation of the 
levels of selected transcription factors may result in increased expression of 
economically useful proteins or metabolic chemicals in plants, or in improvement of other 
agriculturally relevant characteristics. Conversely, blocked or reduced expression of a 
transcription factor may reduce biosynthesis of unwanted compounds or remove an 
undesirable trait. Therefore, manipulating transcription factor levels in a plant offers 
tremendous potential in agricultural biotechnology for modifying a plant's traits. 

The present invention provides novel transcription factors useful for 
modifying a plant's phenotype in desirable ways. 

SUMMARY OF THE INVENTION 

In a first aspect, the invention relates to a recombinant polynucleotide 
comprising a nucleotide sequence selected from the group consisting of: (a) a 
nucleotide sequence encoding a polypeptide comprising a polypeptide sequence 
selected from those of the Sequence Listing, SEQ ID NOs:2 to 2N, where N = 2-561, 
or those listed in Table 4, or a complementary nucleotide sequence thereof; (b) a 
nucleotide sequence encoding a polypeptide comprising a variant of a polypeptide of 
(a) having one or more, or between 1 and about 5, or between 1 and about 10, or 
between 1 and about 30, conservative amino acid substitutions; (c) a nucleotide 
sequence comprising a sequence selected from those of SEQ ID NOs:1 to (2N - 1), 
where N = 2-561, or those included in Table 4, or a complementary nucleotide 
sequence thereof; (d) a nucleotide sequence comprising silent substitutions in a 
nucleotide sequence of (c); (e) a nucleotide sequence which hybridizes under stringent 
conditions over substantially the entire length of a nucleotide sequence of one or more 
of: (a), (b), (c), or (d); (f) a nucleotide sequence comprising at least 10 or 15, or at 
least about 20, or at least about 30 consecutive nucleotides of a sequence of any of 
(a)-(e), or at least 10 or 15, or at least about 20, or at least about 30 consecutive 
nucleotides outside of a region encoding a conserved domain of any of (a)-(e); (g) a 



WO 03/013227 PCT/US02/25805 

nucleotide sequence comprising a subsequence or fragment of any of (a)-(f), which 
subsequence or fragment encodes a polypeptide having a biological activity that 
modifies a plant's characteristic, functions as a transcription factor, or alters the level 
of transcription of a gene or transgene in a cell; (h) a nucleotide sequence having at 
least 31% sequence identity to a nucleotide sequence of any of (a)-(g); (i) a 
nucleotide sequence having at least 60%, or at least 70%, or at least 80%, or at least 
90%, or at least 95% sequence identity to a nucleotide sequence of any of (a)-(g) or a 
10 or 15 nucleotide, or at least about 20, or at least about 30 nucleotide region of a 
sequence of (a)-(g) that is outside of a region encoding a conserved domain; (j) a 
nucleotide sequence that encodes a polypeptide having at least 31% sequence identity 
to a polypeptide listed in Table 4, or the Sequence Listing; (k) a nucleotide sequence 
which encodes a polypeptide having at least 60%, or at least 70%, or at least 80%, or 
at least 90%, or at least 95% sequence identity to a polypeptide listed in Table 4, or 
the Sequence Listing; and (l) a nucleotide sequence that encodes a conserved domain 
of a polypeptide having at least 85%, or at least 90%, or at least 95%, or at least 98% 
sequence identity to a conserved domain of a polypeptide listed in Table 4, or the 
Sequence Listing. The recombinant polynucleotide may further comprise a 
constitutive, inducible, or tissue-specific promoter operably linked to the nucleotide 
sequence. The invention also relates to compositions comprising at least two of the 
above-described polynucleotides. 
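
As an illustrative aid only, the numbering convention recited above (polypeptides at SEQ ID NO:2N, polynucleotides at SEQ ID NO:(2N - 1)) can be sketched in Python. The sketch assumes that N runs from 1 up to the recited upper bound of 561, and that the polynucleotide of SEQ ID NO:(2N - 1) is the one encoding the polypeptide of SEQ ID NO:2N; that pairing is suggested by the layout of Table 4 but is an assumption here. The names `MAX_N`, `seq_id_pair`, and `is_polypeptide_id` are hypothetical.

```python
MAX_N = 561  # upper bound on N recited in the text


def seq_id_pair(n: int) -> tuple[int, int]:
    """Return (polynucleotide SEQ ID NO, polypeptide SEQ ID NO) for index n.

    Assumes odd identifiers are polynucleotides and the following even
    identifier is the encoded polypeptide.
    """
    if not 1 <= n <= MAX_N:
        raise ValueError(f"n must be between 1 and {MAX_N}, got {n}")
    return 2 * n - 1, 2 * n


def is_polypeptide_id(seq_id: int) -> bool:
    """True when seq_id is an even identifier within the recited range."""
    return 2 <= seq_id <= 2 * MAX_N and seq_id % 2 == 0
```

Under these assumptions, `seq_id_pair(2)` gives the pair (3, 4), i.e., a polynucleotide at SEQ ID NO:3 and a polypeptide at SEQ ID NO:4.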

In a second aspect, the invention comprises an isolated or recombinant 
polypeptide comprising a subsequence of at least about 10, or at least about 15, or at 
least about 20, or at least about 30 contiguous amino acids encoded by the 
recombinant or isolated polynucleotide described above, or comprising a subsequence 
of at least about 8, or at least about 12, or at least about 15, or at least about 20, or at 
least about 30 contiguous amino acids outside a conserved domain. 

In a third aspect, the invention comprises an isolated or recombinant 
polynucleotide that encodes a polypeptide that is a paralog of the isolated polypeptide 
described above. In one aspect, the invention is a paralog which, when expressed in 
Arabidopsis, modifies a trait of the Arabidopsis plant. 

In a fourth aspect, the invention comprises an isolated or recombinant 
polynucleotide that encodes a polypeptide that is an ortholog of the isolated 
polypeptide described above. In one aspect, the invention is an ortholog which, when 
expressed in Arabidopsis, modifies a trait of the Arabidopsis plant. 

In a fifth aspect, the invention comprises an isolated polypeptide that is a 
paralog of the isolated polypeptide described above. In one aspect, the invention is an 
paralog which, when expressed in Arabidopsis, modifies a trait of the Arabidopsis 
plant. 

In a sixth aspect, the invention comprises an isolated polypeptide that is an 
ortholog of the isolated polypeptide described above. In one aspect, the invention is 
an ortholog which, when expressed in Arabidopsis, modifies a trait of the 
Arabidopsis plant. 

The present invention also encompasses transcription factor variants. A 
preferred transcription factor variant is one having at least 40% amino acid sequence 
identity, a more preferred transcription factor variant is one having at least 50% amino 
acid sequence identity and a most preferred transcription factor variant is one having 
at least 65% amino acid sequence identity to the transcription factor amino acid 
sequence SEQ ID NOs:2 to 2N, where N = 2-561, and which contains at least one 
functional or structural characteristic of the transcription factor amino acid sequence. 
Sequences having lesser degrees of identity but comparable biological activity are 
considered to be equivalents. 
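
The identity thresholds recited above can be illustrated with a short Python sketch. It assumes the two amino acid sequences have already been aligned to equal length by some alignment tool (gaps written as "-"), and it uses one of several possible conventions: identical residues divided by the full alignment length, with gapped columns counted as mismatches. The function names `percent_identity` and `variant_tier` are hypothetical and are not part of the disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Assumes pre-aligned, equal-length inputs; a column containing a gap
    ("-") never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def variant_tier(identity: float) -> str:
    """Map a percent identity onto the preference tiers recited in the text."""
    if identity >= 65.0:
        return "most preferred"
    if identity >= 50.0:
        return "more preferred"
    if identity >= 40.0:
        return "preferred"
    # Per the text, such sequences may still be equivalents if their
    # biological activity is comparable.
    return "below the stated preference thresholds"
```

Other denominators (for example, the length of the shorter sequence, or only ungapped columns) give different values, so any reported percent identity should state which convention was used.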

In another aspect, the invention is a transgenic plant comprising one or more 
of the above-described isolated or recombinant polynucleotides. In yet another 
aspect, the invention is a plant with altered expression levels of a polynucleotide 
described above or a plant with altered expression or activity levels of an above- 
described polypeptide. Further, the invention is a plant lacking a nucleotide sequence 
encoding a polypeptide described above or substantially lacking a polypeptide 
described above. The plant may be any plant, including, but not limited to, 
Arabidopsis, mustard, soybean, wheat, corn, potato, cotton, rice, oilseed rape, 
sunflower, alfalfa, sugarcane, turf, banana, blackberry, blueberry, strawberry, 
raspberry, cantaloupe, carrot, cauliflower, coffee, cucumber, eggplant, grapes, 
honeydew, lettuce, mango, melon, onion, papaya, peas, peppers, pineapple, pumpkin, 
spinach, squash, sweet corn, tobacco, tomato, watermelon, rosaceous fruits, vegetable 
brassicas, and mint or other labiates. In yet another aspect, the invention is an 
isolated plant material of a plant, including, but not limited to, plant tissue, fruit, seed, 
plant cell, embryo, protoplast, pollen, and the like. In yet another aspect, the 
invention is a transgenic plant tissue culture of regenerable cells, including, but not 
limited to, embryos, meristematic cells, microspores, protoplast, pollen, and the like. 

In yet another aspect the invention is a transgenic plant comprising one or 
more of the above described polynucleotides wherein the encoded polypeptide is 
expressed and regulates transcription of a gene. 

In a further aspect the invention provides a method of using the polynucleotide 
composition to breed a progeny plant from a transgenic plant including crossing 
plants, producing seeds from transgenic plants, and methods of breeding using 
transgenic plants, the method comprising transforming a plant with the polynucleotide 
composition to create a transgenic plant, crossing the transgenic plant with another 
plant, selecting seed, and growing the progeny plant from the seed. 

In a further aspect, the invention provides a progeny plant derived from a 
parental plant wherein said progeny plant exhibits at least three fold greater 
messenger RNA levels than said parental plant, wherein the messenger RNA encodes 
a DNA-binding protein which is capable of binding to a DNA regulatory sequence 
and inducing expression of a plant trait gene, wherein the progeny plant is 
characterized by a change in the plant trait compared to said parental plant. In yet a 
further aspect, the progeny plant exhibits at least ten fold greater messenger RNA 
levels compared to said parental plant. In yet a further aspect, the progeny plant 
exhibits at least fifty fold greater messenger RNA levels compared to said parental 
plant. 
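
The three-fold, ten-fold, and fifty-fold levels recited above amount to a simple ratio test, sketched below in Python. The sketch assumes that messenger RNA levels for the progeny and parental plants have been measured on the same linear scale and normalized in the same way; how they are measured is outside its scope. The names `fold_change` and `thresholds_met` are hypothetical.

```python
FOLD_THRESHOLDS = (3, 10, 50)  # fold increases recited in the text


def fold_change(progeny_mrna: float, parental_mrna: float) -> float:
    """Ratio of progeny to parental mRNA level (same units assumed)."""
    if parental_mrna <= 0:
        raise ValueError("parental mRNA level must be positive")
    return progeny_mrna / parental_mrna


def thresholds_met(fc: float) -> list[int]:
    """Return every recited fold threshold that the fold change reaches."""
    return [t for t in FOLD_THRESHOLDS if fc >= t]
```

For example, a progeny level of 120 units against a parental level of 10 units is a twelve-fold increase, which reaches the three-fold and ten-fold levels but not the fifty-fold level.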

In a further aspect, the invention relates to a cloning or expression vector 
comprising the isolated or recombinant polynucleotide described above or cells 
comprising the cloning or expression vector. 






In yet a further aspect, the invention relates to a composition produced by 
incubating a polynucleotide of the invention with a nuclease, a restriction enzyme, a 
polymerase, a polymerase and a primer, a cloning vector, or with a cell. 

Furthermore, the invention relates to a method for producing a plant having a 
modified trait. The method comprises altering the expression of an isolated or 
recombinant polynucleotide of the invention or altering the expression or activity of a 
polypeptide of the invention in a plant to produce a modified plant, and selecting the 
modified plant for a modified trait. In one aspect, the plant is a monocot plant. In 
another aspect, the plant is a dicot plant. In another aspect the recombinant 
polynucleotide is from a dicot plant and the plant is a monocot plant. In yet another 
aspect the recombinant polynucleotide is from a monocot plant and the plant is a dicot 
plant. In yet another aspect the recombinant polynucleotide is from a monocot plant 
and the plant is a monocot plant. In yet another aspect the recombinant 
polynucleotide is from a dicot plant and the plant is a dicot plant. 

In another aspect, the invention is a transgenic plant comprising an isolated or 
recombinant polynucleotide encoding a polypeptide wherein the polypeptide is 
selected from the group consisting of SEQ ID NOs: 2 - 2N, where N = 2-561. In yet 
another aspect, the invention is a plant with altered expression levels of a polypeptide 
described above or a plant with altered expression or activity levels of an above- 
described polypeptide. Further, the invention is a plant lacking a polynucleotide 
sequence encoding a polypeptide described above or substantially lacking a 
polypeptide described above. The plant may be any plant, including, but not limited 
to, Arabidopsis, mustard, soybean, wheat, corn, potato, cotton, rice, oilseed rape, 
sunflower, alfalfa, sugarcane, turf, banana, blackberry, blueberry, strawberry, 
raspberry, cantaloupe, carrot, cauliflower, coffee, cucumber, eggplant, grapes, 
honeydew, lettuce, mango, melon, onion, papaya, peas, peppers, pineapple, pumpkin, 
spinach, squash, sweet corn, tobacco, tomato, watermelon, rosaceous fruits, vegetable 
brassicas, and mint or other labiates. In yet another aspect, the invention is an 
isolated plant material of a plant, including, but not limited to, plant tissue, fruit, seed, 
plant cell, embryo, protoplast, pollen, and the like. In yet another aspect, the 
invention is a transgenic plant tissue culture of regenerable cells, including, but not 
limited to, embryos, meristematic cells, microspores, protoplast, pollen, and the like. 



In another aspect, the invention relates to a method of identifying a factor that 
is modulated by or interacts with a polypeptide encoded by a polynucleotide of the 
invention. The method comprises expressing a polypeptide encoded by the 
polynucleotide in a plant; and identifying at least one factor that is modulated by or 
interacts with the polypeptide. In one embodiment the method for identifying 
modulating or interacting factors is by detecting binding by the polypeptide to a 
promoter sequence, or by detecting interactions between an additional protein and the 
polypeptide in a yeast two hybrid system, or by detecting expression of a factor by 
hybridization to a microarray, subtractive hybridization, or differential display. 

In yet another aspect, the invention is a method of identifying a molecule that 
modulates activity or expression of a polynucleotide or polypeptide of interest. The 
method comprises placing the molecule in contact with a plant comprising the 
polynucleotide or polypeptide encoded by the polynucleotide of the invention and 
monitoring one or more of the expression level of the polynucleotide in the plant, the 
expression level of the polypeptide in the plant, and modulation of an activity of the 
polypeptide in the plant. 

In yet another aspect, the invention relates to an integrated system, computer 
or computer readable medium comprising one or more character strings 
corresponding to a polynucleotide of the invention, or to a polypeptide encoded by the 
polynucleotide. The integrated system, computer or computer readable medium may 
comprise a link between one or more sequence strings to a modified plant trait. 
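
One way such a link between sequence character strings and a modified plant trait might be represented is sketched below in Python. This is only an illustration of the idea, not the integrated system of the invention; the class `SequenceRecord`, its field names, and the function `index_by_trait` are hypothetical, and any identifiers or trait labels passed in would come from the user's own data.

```python
from dataclasses import dataclass, field


@dataclass
class SequenceRecord:
    identifier: str   # hypothetical label for the record
    kind: str         # "polynucleotide" or "polypeptide"
    residues: str     # the character string for the sequence
    traits: list[str] = field(default_factory=list)  # linked trait labels


def index_by_trait(records: list[SequenceRecord]) -> dict[str, list[str]]:
    """Build a lookup from each trait label to the linked record identifiers."""
    index: dict[str, list[str]] = {}
    for rec in records:
        for trait in rec.traits:
            index.setdefault(trait, []).append(rec.identifier)
    return index
```

Keeping the trait labels on each record and deriving the reverse index on demand avoids storing the same link in two places.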

In yet another aspect, the invention is a method for identifying a sequence 
similar or homologous to one or more polynucleotides of the invention, or one or 
more polypeptides encoded by the polynucleotides. The method comprises providing 
a sequence database, and querying the sequence database with one or more target 
sequences corresponding to the one or more polynucleotides or to the one or more 
polypeptides to identify one or more sequence members of the database that display 
sequence similarity or homology to one or more of the one or more target sequences. 
The method may further comprise linking the one or more of the 
polynucleotides of the invention, or encoded polypeptides, to a modified plant 
phenotype. 
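
The query step described above can be sketched in Python as follows. The sketch deliberately swaps in a very simple similarity measure, the fraction of shared k-mers (Jaccard index), in place of a real alignment-based search tool; it is meant only to show the shape of "provide a database, query with a target, collect similar members." The function names `kmers` and `query_database`, the in-memory dictionary used as the database, and the default parameter values are all assumptions made for illustration.

```python
def kmers(seq: str, k: int = 3) -> set[str]:
    """Return the set of all length-k substrings of seq (case-insensitive)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def query_database(
    target: str,
    database: dict[str, str],
    k: int = 3,
    min_score: float = 0.5,
) -> list[tuple[str, float]]:
    """Return (member name, score) pairs scoring at least min_score.

    The score is the Jaccard index of the two k-mer sets, a crude
    stand-in for alignment-based sequence similarity.
    """
    target_kmers = kmers(target, k)
    hits = []
    for name, member_seq in database.items():
        member_kmers = kmers(member_seq, k)
        union = target_kmers | member_kmers
        if not union:
            continue  # both sequences shorter than k
        score = len(target_kmers & member_kmers) / len(union)
        if score >= min_score:
            hits.append((name, score))
    # Best-scoring database members first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

A production search would use an established alignment program and a statistical significance measure rather than a raw k-mer overlap, which ignores residue order beyond k positions.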

BRIEF DESCRIPTION OF THE SEQUENCE LISTING, 
TABLES, AND FIGURE 

The Sequence Listing provides exemplary polynucleotide and polypeptide 
sequences of the invention. The traits associated with the use of the sequences are 
included in the Examples. 

Diskette 1 is a read-only memory computer-readable diskette and contains a 
copy of the Sequence Listing in ASCII text format. The Sequence Listing is named 
"SEQLIST5 14442002041" and is 929 kilobytes in size. The copy of the Sequence 
Listing on the diskette is hereby incorporated by reference in its entirety. 

Table 4 shows the polynucleotides and polypeptides identified by SEQ ID 
NO; Mendel Gene ID No.; conserved domain of the polypeptide; and whether the 
polynucleotide was tested in a transgenic assay. The first column shows the 
polynucleotide SEQ ID NO; the second column shows the Mendel Gene ID No., GID; 
the third column shows the trait(s) resulting from the knock out or overexpression of 
the polynucleotide in the transgenic plant; the fourth column shows the category of 
the trait; the fifth column shows the transcription factor family to which the 
polynucleotide belongs; the sixth column ("Comment"), includes specific effects and 
utilities conferred by the polynucleotide of the first column; the seventh column 
shows the SEQ ID NO of the polypeptide encoded by the polynucleotide; and the 
eighth column shows the amino acid residue positions of the conserved domain in 
amino acid (AA) co-ordinates. 
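
For readers who wish to handle Table 4 programmatically, the eight columns described above map naturally onto a record type, sketched here in Python. The class name `Table4Row` and its field names are hypothetical, and the sketch assumes the conserved domain is given as a single inclusive pair of amino acid coordinates, which may not hold for every row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table4Row:
    polynucleotide_seq_id: int          # column 1: polynucleotide SEQ ID NO
    gid: str                            # column 2: Mendel Gene ID No. (GID)
    traits: str                         # column 3: trait(s) from knockout or overexpression
    trait_category: str                 # column 4: category of the trait
    tf_family: str                      # column 5: transcription factor family
    comment: str                        # column 6: specific effects and utilities
    polypeptide_seq_id: int             # column 7: SEQ ID NO of the encoded polypeptide
    conserved_domain: tuple[int, int]   # column 8: (start, end) in AA coordinates

    def domain_length(self) -> int:
        """Number of residues in the conserved domain (inclusive coordinates assumed)."""
        start, end = self.conserved_domain
        if start < 1 or end < start:
            raise ValueError("conserved domain coordinates must satisfy 1 <= start <= end")
        return end - start + 1
```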

Table 5 lists a summary of orthologous and homologous sequences identified 
using BLAST (tblastx program). The first column shows the polynucleotide sequence 
identifier (SEQ ID NO), the second column shows the corresponding cDNA identifier 
(Gene ID), the third column shows the orthologous or homologous polynucleotide 
GenBank Accession Number (Test Sequence ID), the fourth column shows the 
calculated probability value that the sequence identity is due to chance (Smallest Sum 
Probability), the fifth column shows the plant species from which the test sequence 
was isolated (Test Sequence Species), and the sixth column shows the orthologous or 
homologous test sequence GenBank annotation (Test Sequence GenBank 
Annotation). 

Figure 1 shows a phylogenetic tree of related plant families adapted from Daly 
et al. (2001 Plant Physiology 127:1328-1333). 

Detailed Description of Exemplary Embodiments 

In an important aspect, the present invention relates to polynucleotides and 
polypeptides, e.g., for modifying phenotypes of plants. Throughout this disclosure, 
various information sources are referred to and/or are specifically incorporated. The 
information sources include scientific journal articles, patent documents, textbooks, 
and World Wide Web browser-inactive page addresses, for example. While the 
reference to these information sources clearly indicates that they can be used by one 
of skill in the art, applicants specifically incorporate each and every one of the 
information sources cited herein, in their entirety, whether or not a specific mention of 
"incorporation by reference" is noted. The contents and teachings of each and every 
one of the information sources can be relied on and used to make and use 
embodiments of the invention. 

It must be noted that as used herein and in the appended claims, the singular 
forms "a," "an," and "the" include plural reference unless the context clearly dictates 
otherwise. Thus, for example, a reference to "a plant" includes a plurality of such 
plants, and a reference to "a stress" is a reference to one or more stresses and 
equivalents thereof known to those skilled in the art, and so forth. 

The polynucleotide sequences of the invention encode polypeptides that are 
members of well-known transcription factor families, including plant transcription 
factor families, as disclosed in Table 4. Generally, the transcription factors encoded 
by the present sequences are involved in cell differentiation and proliferation and the 
regulation of growth. Accordingly, one skilled in the art would recognize that by 
expressing the present sequences in a plant, one may change the expression of 
autologous genes or induce the expression of introduced genes. By affecting the 
expression of similar autologous sequences in a plant that have the biological activity 
of the present sequences, or by introducing the present sequences into a plant, one 
may alter a plant's phenotype to one with improved traits. The sequences of the 
invention may also be used to transform a plant and introduce desirable traits not 
found in the wild-type cultivar or strain. Plants may then be selected for those that 
produce the most desirable degree of over- or underexpression of target genes of 
interest and coincident trait improvement. 

The sequences of the present invention may be from any species, particularly 
plant species, in a naturally occurring form or from any source whether natural, 
synthetic, semi-synthetic or recombinant. The sequences of the invention may also 
include fragments of the present amino acid sequences. In this context, a "fragment" 
refers to a fragment of a polypeptide sequence which is at least 5 to about 15 amino 
acids in length, most preferably at least 14 amino acids, and which retains some 
biological activity of a transcription factor. Where "amino acid sequence" is recited to 
refer to an amino acid sequence of a naturally occurring protein molecule, "amino 
acid sequence" and like terms are not meant to limit the amino acid sequence to the 
complete native amino acid sequence associated with the recited protein molecule. 

As one of ordinary skill in the art recognizes, transcription factors can be 
identified by the presence of a region or domain of structural similarity or identity to a 
specific consensus sequence or the presence of a specific consensus DNA-binding site 
or DNA-binding site motif (see, for example, Riechmann et al. (2000) Science 
290:2105-2110). The plant transcription factors may belong to one of the following 
transcription factor families: the AP2 (APETALA2) domain transcription factor 
family (Riechmann and Meyerowitz (1998) Biol Chem. 379:633-646); the MYB 
transcription factor family (Martin and Paz-Ares, (1997) Trends Genet 13:67-73); the 
MADS domain transcription factor family (Riechmann and Meyerowitz (1997) Biol 
Chem. 378:1079-1101); the WRKY protein family (Ishiguro and Nakamura (1994) 
Mol Gen. Genet. 244:563-571); the ankyrin-repeat protein family (Zhang et al. 
(1992) Plant Cell 4:1575-1588); the zinc finger protein (Z) family (Klug and Schwabe 
(1995) FASEB J. 9:597-604); the homeobox (HB) protein family (Buerglin in 
Guidebook to the Homeobox Genes, Duboule (ed.) (1994) Oxford University Press); 
the CAAT-element binding proteins (Forsburg and Guarente (1989) Genes Dev. 
3:1166-1178); the squamosa promoter binding proteins (SPB) (Klein et al. (1996) 
Mol. Gen. Genet. 250:7-16); the NAM protein family (Souer et al. (1996) Cell 
85:159-170); the IAA/AUX proteins (Rouse et al. (1998) Science 279:1371-1373); 
the HLH/MYC protein family (Littlewood et al. (1994) Prot. Profile 1:639-709); the 
DNA-binding protein (DBP) family (Tucker et al. (1994) EMBO J. 13:2994-3002); 
the bZIP family of transcription factors (Foster et al. (1994) FASEB J. 8:192-200); the 
Box P-binding protein (the BPF-1) family (da Costa e Silva et al. (1993) Plant J. 
4:125-135); the high mobility group (HMG) family (Bustin and Reeves (1996) Prog. 
Nucl. Acids Res. Mol. Biol. 54:35-100); the scarecrow (SCR) family (Di Laurenzio et 
al. (1996) Cell 86:423-433); the GF14 family (Wu et al. (1997) Plant Physiol. 
114:1421-1431); the polycomb (PCOMB) family (Kennison (1995) Annu. Rev. Genet. 
29:289-303); the teosinte branched (TEO) family (Luo et al. (1996) Nature 
383:794-799); the ABI3 family (Giraudat et al. (1992) Plant Cell 4:1251-1261); the triple helix 
(TH) family (Dehesh et al. (1990) Science 250:1397-1399); the EIL family (Chao et 
al. (1997) Cell 89:1133-44); the AT-HOOK family (Reeves and Nissen (1990) J. Biol. 
Chem. 265:8573-8582); the S1FA family (Zhou et al. (1995) Nucleic Acids Res. 
23:1165-1169); the bZIPT2 family (Lu and Ferl (1995) Plant Physiol. 109:723); the 
YABBY family (Bowman et al. (1999) Development 126:2387-96); the PAZ family 
(Bohmert et al. (1998) EMBO J. 17:170-80); a family of miscellaneous (MISC) 
transcription factors including the DPBF family (Kim et al. (1997) Plant J. 
11:1237-1251) and the SPF1 family (Ishiguro and Nakamura (1994) Mol. Gen. Genet. 
244:563-571); the golden (GLD) family (Hall et al. (1998) Plant Cell 10:925-936), 
the TUBBY family (Boggon et al. (1999) Science 286:2119-2125), the heat shock 
family (Wu C (1995) Annu Rev Cell Dev Biol 11:441-469), the ENBP family 
(Christiansen et al (1996) Plant Mol Biol 32:809-821), the RING-zinc family (Jensen 
et al. (1998) FEBS Letters 436:283-287), the PDBP family (Janik et al Virology. 
(1989) 168:320-329), the PCF family (Cubas P, et al. Plant J. (1999) 18:215-22), the 
SRS (SHI-related) family (Fridborg et al Plant Cell (1999) 11:1019-1032), the CPP 
(cysteine-rich polycomb-like) family (Cvitanich et al Proc. Natl. Acad. Sci. USA. 
(2000) 97:8163-8168), the ARF (auxin response factor) family (Ulmasov, et al. 
(1999) Proc. Natl. Acad. Sci. USA 96: 5844-5849), the SWI/SNF family 
(Collingwood et al J. Mol. End. 23:255-275), the ACBF family (Seguin et al (1997) 
Plant Mol Biol. 35:281-291), the PCGL (CG-1 like) family (da Costa e Silva et al. 
(1994) Plant Mol Biol 25:921-924), the ARID family (Vazquez et al. (1999) 
Development 126:733-42), the Jumonji family (Balciunas et al. (2000) Trends 
Biochem Sci. 25:274-276), the bZIP-NIN family (Schauser et al. (1999) Nature 
402:191-195), the E2F family (Kaelin et al. (1992) Cell 70:351-364) and the GRF-like 
family (Knaap et al. (2000) Plant Physiol. 122:695-704). As indicated by any part of 
the list above and as known in the art, transcription factors have been sometimes 
categorized by class, family, and sub-family according to their structural content and 
consensus DNA-binding site motif, for example. Many of the classes and many of the 
families and sub-families are listed here. However, the inclusion of one sub-family 
and not another, or the inclusion of one family and not another, does not mean that the 
invention does not encompass polynucleotides or polypeptides of a certain family or 
sub-family. The list provided here is merely an example of the types of transcription 
factors and the knowledge available concerning the consensus sequences and 
consensus DNA-binding site motifs that help define them as known to those of skill in 
the art (each of the references noted above is specifically incorporated herein by 
reference). A transcription factor may include, but is not limited to, any polypeptide 
that can activate or repress transcription of a single gene or a number of genes. This 
polypeptide group includes, but is not limited to, DNA-binding proteins, DNA- 
binding protein binding proteins, protein kinases, protein phosphatases, GTP-binding 
proteins, and receptors, and the like. 

In addition to methods for modifying a plant phenotype by employing one or 
more polynucleotides and polypeptides of the invention described herein, the 
polynucleotides and polypeptides of the invention have a variety of additional uses. 
These uses include their use in the recombinant production (i.e., expression) of 
proteins; as regulators of plant gene expression, as diagnostic probes for the presence 
of complementary or partially complementary nucleic acids (including for detection 
of natural coding nucleic acids); as substrates for further reactions, e.g., mutation 
reactions, PCR reactions, or the like; as substrates for cloning e.g., including digestion 
or ligation reactions; and for identifying exogenous or endogenous modulators of the 
transcription factors. 

A "polynucleotide" is a nucleic acid sequence comprising a 
plurality of polymerized nucleotides, e.g., at least about 15 consecutive polymerized 
nucleotides, optionally at least about 30 consecutive nucleotides, or at least about 50 
consecutive nucleotides. In many instances, a polynucleotide comprises a nucleotide 
sequence encoding a polypeptide (or protein) or a domain or fragment thereof. 
Additionally, the polynucleotide may comprise a promoter, an intron, an enhancer 
region, a polyadenylation site, a translation initiation site, 5' or 3' untranslated 
regions, a reporter gene, a selectable marker, or the like. The polynucleotide can be 
single stranded or double stranded DNA or RNA. The polynucleotide optionally 
comprises modified bases or a modified backbone. The polynucleotide can be, e.g., 
genomic DNA or RNA, a transcript (such as an mRNA), a cDNA, a PCR product, a 
cloned DNA, a synthetic DNA or RNA, or the like. The polynucleotide can comprise 
a sequence in either sense or antisense orientations. 
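
The relationship between the sense and antisense orientations mentioned above is the reverse complement, which the following Python sketch illustrates for DNA. It assumes an input restricted to the unambiguous bases A, C, G, and T; the function name `antisense` is hypothetical.

```python
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense(sense_dna: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    seq = sense_dna.upper()
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"unsupported characters: {sorted(unexpected)}")
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return seq.translate(_DNA_COMPLEMENT)[::-1]
```

Applying the function twice returns the original sequence, which is a convenient sanity check when handling sequences in both orientations.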

A "recombinant polynucleotide" is a polynucleotide that is not in its native 
state, e.g., the polynucleotide comprises a nucleotide sequence not found in nature, or 
the polynucleotide is in a context other than that in which it is naturally found, e.g., 
separated from nucleotide sequences with which it typically is in proximity in nature, 
or adjacent (or contiguous with) nucleotide sequences with which it typically is not in 
proximity. For example, the sequence at issue can be cloned into a vector, or 
otherwise recombined with one or more additional nucleic acids. 

An "isolated polynucleotide" is a polynucleotide, whether naturally occurring 
or recombinant, that is present outside the cell in which it is typically found in nature, 
whether purified or not. Optionally, an isolated polynucleotide is subject to one or 
more enrichment or purification procedures, e.g., cell lysis, extraction, centrifugation, 
precipitation, or the like. 

A "polypeptide" is an amino acid sequence comprising a plurality of 
consecutive polymerized amino acid residues, e.g., at least about 15 consecutive 
polymerized amino acid residues, optionally at least about 30 consecutive 
polymerized amino acid residues, or at least about 50 consecutive polymerized amino 
acid residues. In many instances, a polypeptide comprises a polymerized amino acid 
residue sequence that is a transcription factor or a domain or portion or fragment 
thereof. Additionally, the polypeptide may comprise 1) a localization domain, 2) an 
activation domain, 3) a repression domain, 4) an oligomerization domain or 5) a 
DNA-binding domain, or the like. The polypeptide optionally comprises modified 
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amino acid residues, naturally occurring amino acid residues not encoded by a codon, 
non-naturally occurring amino acid residues. 

A "recombinant polypeptide" is a polypeptide produced by translation of a 
recombinant polynucleotide. A "synthetic polypeptide" is a polypeptide created by 
consecutive polymerization of isolated amino acid residues using methods well 
known in the art. An "isolated polypeptide," whether a naturally occurring or a 
recombinant polypeptide, is more enriched in (or out of) a cell than the polypeptide in 
its natural state in a wild type cell, e.g., more than about 5% enriched, more than 
about 10% enriched, or more than about 20%, or more than about 50%, or more, 
enriched, i.e., alternatively denoted: 105%, 110%, 120%, 150% or more, enriched 
relative to wild type standardized at 100%. Such an enrichment is not the result of a 
natural response of a wild type plant. Alternatively, or additionally, the isolated 
polypeptide is separated from other cellular components with which it is typically 
associated, e.g., by any of the various protein purification methods herein. 

"Identity" or "similarity" refers to sequence similarity between two 
polynucleotide sequences or between two polypeptide sequences, with identity being 
a more strict comparison. The phrases "percent identity" and "% identity" refer to the 
percentage of sequence similarity found in a comparison of two or more 
polynucleotide sequences or two or more polypeptide sequences. Identity or 
similarity can be determined by comparing a position in each sequence that may be 
aligned for purposes of comparison. When a position in the compared sequence is 
occupied by the same nucleotide base or amino acid, then the molecules are identical 
at that position. A degree of similarity or identity between polynucleotide sequences 
is a function of the number of identical or matching nucleotides at positions shared by 
the polynucleotide sequences. A degree of identity of polypeptide sequences is a 
function of the number of identical amino acids at positions shared by the polypeptide 
sequences. A degree of homology or similarity of polypeptide sequences is a function 
of the number of amino acids, i.e., structurally related, at positions shared by the 
polypeptide sequences. 
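
The position-by-position comparison described above can be sketched in Python as follows. This is a minimal illustration, not the method of any particular alignment program: it assumes the two sequences have already been aligned to equal length with "-" marking gaps, and it takes "positions shared" to mean columns in which neither sequence has a gap. Other conventions for the denominator exist, and the function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the positions shared by two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only the columns where both sequences have a residue (no gap).
    shared = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not shared:
        return 0.0
    # A position is identical when both sequences carry the same base or residue.
    matches = sum(1 for a, b in shared if a == b)
    return 100.0 * matches / len(shared)


# Hypothetical aligned nucleotide sequences: 5 shared positions, 4 identical.
print(percent_identity("ACGT-A", "ACCTTA"))
```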

"Altered" nucleic acid sequences encoding polypeptide include those 
sequences with deletions, insertions, or substitutions of different nucleotides, resulting 
in a polynucleotide encoding a polypeptide with at least one functional characteristic 
of the polypeptide. Included within this definition are polymorphisms that may or 
may not be readily detectable using a particular oligonucleotide probe of the 
polynucleotide encoding polypeptide, and improper or unexpected hybridization to 
allelic variants, with a locus other than the normal chromosomal locus for the 
polynucleotide sequence encoding polypeptide. The encoded polypeptide protein 
may also be "altered", and may contain deletions, insertions, or substitutions of amino 
acid residues that produce a silent change and result in a functionally equivalent 
polypeptide. Deliberate amino acid substitutions may be made on the basis of 
similarity in residue side chain chemistry, including, but not limited to, polarity, 
charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of 
the residues, as long as the biological activity of polypeptide is retained. For 
example, negatively charged amino acids may include aspartic acid and glutamic acid, 
positively charged amino acids may include lysine and arginine, and amino acids with 
uncharged polar head groups having similar hydrophilicity values may include 
leucine, isoleucine, and valine; glycine and alanine; asparagine and glutamine; serine 
and threonine; and phenylalanine and tyrosine. Alignments between different 
polypeptide sequences may be used to calculate "percentage sequence similarity". 
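
By way of a hedged illustration, the substitution groups listed in the preceding paragraph can be used to compute a simple percentage sequence similarity for two pre-aligned polypeptide sequences. The grouping below is copied from that paragraph; the function names, the gap symbol "-", and the choice to score only gap-free columns are assumptions made for this sketch. In practice, alignment programs generally rely on scoring matrices rather than fixed groups.

```python
# Substitution groups taken from the paragraph above (one-letter codes):
# Asp/Glu, Lys/Arg, Leu/Ile/Val, Gly/Ala, Asn/Gln, Ser/Thr, Phe/Tyr.
GROUPS = ["DE", "KR", "LIV", "GA", "NQ", "ST", "FY"]
GROUP_OF = {res: idx for idx, grp in enumerate(GROUPS) for res in grp}


def is_conservative(a: str, b: str) -> bool:
    """True if residues are identical or belong to the same substitution group."""
    return a == b or (a in GROUP_OF and GROUP_OF.get(a) == GROUP_OF.get(b))


def percent_similarity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of gap-free aligned positions that are identical or conservative."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    shared = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not shared:
        return 0.0
    similar = sum(1 for a, b in shared if is_conservative(a, b))
    return 100.0 * similar / len(shared)


# Hypothetical peptides: every position is a within-group substitution.
print(percent_similarity("DKLGNSF", "ERIAQTY"))
```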

The term "plant" includes whole plants, shoot vegetative organs/structures 
(e.g., leaves, stems and tubers), roots, flowers and floral organs/structures (e.g., 
bracts, sepals, petals, stamens, carpels, anthers and ovules), seed (including embryo, 
endosperm, and seed coat) and fruit (the mature ovary), plant tissue (e.g., vascular 
tissue, ground tissue, and the like) and cells (e.g., guard cells, egg cells, and the like), 
and progeny of same. The class of plants that can be used in the method of the 
invention is generally as broad as the class of higher and lower plants amenable to 
transformation techniques, including angiosperms (monocotyledonous and 
dicotyledonous plants), gymnosperms, ferns, horsetails, psilophytes, lycophytes, 
bryophytes, and multicellular algae. (See for example, Figure 1, adapted from Daly et 
al. 2001 Plant Physiology 127:1328-1333; and see also Tudge, C, The Variety of 
Life, Oxford University Press, New York, 2000, pp. 547-606.) 

A "transgenic plant" refers to a plant that contains genetic material not found 
in a wild type plant of the same species, variety or cultivar. The genetic material may 
include a transgene, an insertional mutagenesis event (such as by transposon or 
T-DNA insertional mutagenesis), an activation tagging sequence, a mutated sequence, a 
homologous recombination event or a sequence modified by chimeraplasty. 
Typically, the foreign genetic material has been introduced into the plant by human 
manipulation, but any method can be used as one of skill in the art recognizes. 

A transgenic plant may contain an expression vector or cassette. The 
expression cassette typically comprises a polypeptide-encoding sequence operably 
linked (i.e., under regulatory control of) to appropriate inducible or constitutive 
regulatory sequences that allow for the expression of polypeptide. The expression 
cassette can be introduced into a plant by transformation or by breeding after 
transformation of a parent plant. A plant refers to a whole plant as well as to a plant 
part, such as seed, fruit, leaf, or root, plant tissue, plant cells or any other plant 
material, e.g., a plant explant, as well as to progeny thereof, and to in vitro systems 
that mimic biochemical or cellular components or processes in a cell. 

"Ectopic expression or altered expression" in reference to a polynucleotide 
indicates that the pattern of expression in, e.g., a transgenic plant or plant tissue, is 
different from the expression pattern in a wild type plant or a reference plant of the 
same species. The pattern of expression may also be compared with a reference 
expression pattern in a wild type plant of the same species. For example, the 
polynucleotide or polypeptide is expressed in a cell or tissue type other than a cell or 
tissue type in which the sequence is expressed in the wild type plant, or by expression 
at a time other than at the time the sequence is expressed in the wild type plant, or by 
a response to different inducible agents, such as hormones or environmental signals, 
or at different expression levels (either higher or lower) compared with those found in 
a wild type plant. The term also refers to altered expression patterns that are produced 
by lowering the levels of expression to below the detection level or completely 
abolishing expression. The resulting expression pattern can be transient or stable, 
constitutive or inducible. In reference to a polypeptide, the term "ectopic expression 
or altered expression" further may relate to altered activity levels resulting from the 
interactions of the polypeptides with exogenous or endogenous modulators or from 
interactions with factors or as a result of the chemical modification of the 
polypeptides. 

A "fragment" or "domain," with respect to a polypeptide, refers to a 
subsequence of the polypeptide. In some cases, the fragment or domain is a 
subsequence of the polypeptide which performs at least one biological function of the 
intact polypeptide in substantially the same manner, or to a similar extent, as does the 
intact polypeptide. For example, a polypeptide fragment can comprise a recognizable 
structural motif or functional domain such as a DNA-binding site or domain that 
binds to a DNA promoter region, an activation domain, or a domain for protein- 
protein interactions. Fragments can vary in size from as few as 6 amino acids to the 
full length of the intact polypeptide, but are preferably at least about 30 amino acids in 
length and more preferably at least about 60 amino acids in length. In reference to a 
polynucleotide sequence, "a fragment" refers to any subsequence of a polynucleotide, 
typically, of at least about 15 consecutive nucleotides, preferably at least about 30 
nucleotides, more preferably at least about 50 nucleotides, of any of the sequences 
provided herein. 

The invention also encompasses production of DNA sequences that encode 
transcription factors and transcription factor derivatives, or fragments thereof, entirely 
by synthetic chemistry. After production, the synthetic sequence may be inserted into 
any of the many available expression vectors and cell systems using reagents well 
known in the art. Moreover, synthetic chemistry may be used to introduce mutations 
into a sequence encoding transcription factors or any fragment thereof. 

A "conserved domain", with respect to a polypeptide, refers to a domain 
within a transcription factor family which exhibits a higher degree of sequence 
homology, such as at least 65% sequence identity including conservative 
substitutions, and preferably at least 80% sequence identity, and more preferably at 
least 85%, or at least about 86%, or at least about 87%, or at least about 88%, or at 
least about 90%, or at least about 95%, or at least about 98% amino acid residue 
sequence identity of a polypeptide of consecutive amino acid residues. A fragment or 
domain can be referred to as outside a consensus sequence or outside a consensus 
DNA-binding site that is known to exist or that exists for a particular transcription 
factor class, family, or sub-family. In this case, the fragment or domain will not 
include the exact amino acids of a consensus sequence or consensus DNA-binding 
site of a transcription factor class, family or sub-family, or the exact amino acids of a 
particular transcription factor consensus sequence or consensus DNA-binding site. 
Furthermore, a particular fragment, region, or domain of a polypeptide, or a 
polynucleotide encoding a polypeptide, can be "outside a conserved domain" if all the 
amino acids of the fragment, region, or domain fall outside of a defined conserved 
domain(s) for a polypeptide or protein. The conserved domains for each of the 
polypeptides of SEQ ID NOs:2 - 2N, where N = 2-561, are listed in Table 4 as 
described in Example VII. Also, many of the polypeptides of Table 4 have conserved 
domains specifically indicated by start and stop sites. A comparison of the regions of 
the polypeptides in SEQ ID NOs:2 - 2N, where N = 2-561, or of those in Table 4, 
allows one of skill in the art to identify conserved domain(s) for any of the 
polypeptides listed or referred to in this disclosure, including those in Table 4. 

A "trait" refers to a physiological, morphological, biochemical, or physical 
characteristic of a plant or particular plant material or cell. In some instances, this 
characteristic is visible to the human eye, such as seed or plant size, or can be 
measured by biochemical techniques, such as detecting the protein, starch, or oil 
content of seed or leaves, or by observation of a metabolic or physiological process, 
e.g. by measuring uptake of carbon dioxide, or by the observation of the expression 
level of a gene or genes, e.g., by employing Northern analysis, RT-PCR, microarray 
gene expression assays, or reporter gene expression systems, or by agricultural 
observations such as stress tolerance, yield, or pathogen tolerance. Any technique can 
be used to measure the amount of, comparative level of, or difference in any selected 
chemical compound or macromolecule in the transgenic plants, however. 

"Trait modification" refers to a detectable difference in a characteristic in a 
plant ectopically expressing a polynucleotide or polypeptide of the present invention 
relative to a plant not doing so, such as a wild type plant. In some cases, the trait 
modification can be evaluated quantitatively. For example, the trait modification can 
entail at least about a 2% increase or decrease in an observed trait (difference), at least 
a 5% difference, at least about a 10% difference, at least about a 20% difference, at 
least about a 30%, at least about a 50%, at least about a 70%, or at least about a 100%, 
or an even greater difference compared with a wild type plant. It is known that there 
can be a natural variation in the modified trait. Therefore, the trait modification 
observed entails a change of the normal distribution of the trait in the plants compared 
with the distribution observed in wild type plants. 
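
The quantitative evaluation described above amounts to a percent difference relative to wild type. The following Python sketch shows the arithmetic, using the mean of replicate measurements as a simple way to allow for natural variation. The function name and the measurement values are hypothetical and do not represent experimental data.

```python
from statistics import mean


def percent_difference(transgenic_value: float, wild_type_value: float) -> float:
    """Signed percent difference of a trait value relative to wild type."""
    if wild_type_value == 0:
        raise ValueError("wild type value must be non-zero")
    return 100.0 * (transgenic_value - wild_type_value) / wild_type_value


# Hypothetical replicate measurements of one trait (arbitrary units).
transgenic = [11, 12, 13]
wild_type = [9, 10, 11]

change = percent_difference(mean(transgenic), mean(wild_type))
print(change)
```

A positive result indicates an increase and a negative result a decrease; either may qualify as a trait modification when its magnitude exceeds the chosen threshold.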

I. Traits Which May Be Modified 

Trait modifications of particular interest include those to seed (such as embryo 
or endosperm), fruit, root, flower, leaf, stem, shoot, seedling or the like, including: 
enhanced tolerance to environmental conditions including freezing, chilling, heat, 
drought, water saturation, radiation and ozone; improved tolerance to microbial, 
fungal or viral diseases; improved tolerance to pest infestations, including nematodes, 
mollicutes, parasitic higher plants or the like; decreased herbicide sensitivity; 
improved tolerance of heavy metals or enhanced ability to take up heavy metals; 
improved growth under poor photoconditions (e.g., low light and/or short day length), 
or changes in expression levels of genes of interest. Other phenotypes that can be 
modified relate to the production of plant metabolites, such as variations in the 
production of taxol, tocopherol, tocotrienol, sterols, phytosterols, vitamins, wax 
monomers, anti-oxidants, amino acids, lignins, cellulose, tannins, prenyllipids (such 
as chlorophylls and carotenoids), glucosinolates, and terpenoids, enhanced or 
compositionally altered protein or oil production (especially in seeds), or modified 
sugar (insoluble or soluble) and/or starch composition. Physical plant characteristics 
that can be modified include cell development (such as the number of trichomes), fruit 
and seed size and number, yields of plant parts such as stems, leaves, inflorescences, 
and roots, the stability of the seeds during storage, characteristics of the seed pod 
(e.g., susceptibility to shattering), root hair length and quantity, internode distances, or 
the quality of seed coat. Plant growth characteristics that can be modified include 
growth rate, germination rate of seeds, vigor of plants and seedlings, leaf and flower 
senescence, male sterility, apomixis, flowering time, flower abscission, rate of 
nitrogen uptake, osmotic sensitivity to soluble sugar concentrations, biomass or 
transpiration characteristics, as well as plant architecture characteristics such as apical 
dominance, branching patterns, number of organs, organ identity, organ shape or size. 

II. Transcription Factors Modify Expression Of Endogenous Genes 

Genes which encode transcription factors that modify expression 
of endogenous genes, polynucleotides, and proteins are well known in the art. In 
addition, transgenic plants comprising isolated polynucleotides encoding transcription 
factors may also modify expression of endogenous genes, polynucleotides, and 
proteins. Examples include Peng et al. (1997, Genes and Development 11:3194-3205) 
and Peng et al. (1999, Nature, 400:256-261). In addition, many others have 
demonstrated that an Arabidopsis transcription factor expressed in an exogenous plant 
species elicits the same or very similar phenotypic response. See, for example, Fu et 
al. (2001, Plant Cell 13:1791-1802); Nandi et al. (2000, Curr. Biol. 10:215-218); 
Coupland (1995, Nature 377:482-483); and Weigel and Nilsson (1995, Nature 
377:482-500). 

In another example, Mandel et al. (1992, Cell 71:133-143) and Suzuki et al. 
(2001, Plant J. 28:409-418) teach that a transcription factor expressed in another plant 
species elicits the same or very similar phenotypic response of the endogenous 
sequence, as often predicted in earlier studies of Arabidopsis transcription factors in 
Arabidopsis (see Mandel et al., 1992, supra; Suzuki et al., 2001, supra). 

Other examples include Muller et al. (2001, Plant J. 28:169-179); Kim et al. 
(2001, Plant J. 25:247-259); Kyozuka and Shimamoto (2002, Plant Cell Physiol. 
43:130-135); Boss and Thomas (2002, Nature, 416:847-850); He et al. (2000, 
Transgenic Res., 9:223-227); and Robson et al. (2001, Plant J. 28:619-631). 

In yet another example, Gilmour et al. (1998, Plant J. 16:433-442) teach an 
Arabidopsis AP2 transcription factor, CBF1, which, when overexpressed in transgenic 
plants, increases plant freezing tolerance. Jaglo et al. (2001, Plant Physiol. 
127:910-917) further identified sequences in Brassica napus which encode CBF-like 
genes and showed that transcripts for these genes accumulated rapidly in response to 
low temperature. 
Transcripts encoding CBF-like proteins were also found to accumulate rapidly in 
response to low temperature in wheat, as well as in tomato. An alignment of the CBF 
proteins from Arabidopsis, B. napus, wheat, rye, and tomato revealed the presence of 
conserved amino acid sequences, PKK/RPAGRxKFxETRHP and DSAWR, that 
bracket the AP2/EREBP DNA binding domains of the proteins and distinguish them 
from other members of the AP2/EREBP protein family. (See Jaglo et al., supra.) 
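
As a hedged illustration, the two conserved CBF signature sequences quoted above can be searched for with regular expressions. The sketch assumes that "K/R" denotes either lysine or arginine at that position and that "x" denotes any residue; the function name and the test sequence are hypothetical, and the sequence is not a real CBF protein.

```python
import re

# Assumed reading of the motifs: "K/R" = K or R, "x" = any single residue.
CBF_UPSTREAM = re.compile(r"PK[KR]PAGR.KF.ETRHP")
CBF_DOWNSTREAM = re.compile(r"DSAWR")


def find_cbf_signature(protein: str):
    """Return (start, end) of the region spanning both motifs, or None.

    The downstream motif is searched for only after the end of the upstream
    motif, reflecting the statement that the two motifs bracket the
    DNA binding domain.
    """
    up = CBF_UPSTREAM.search(protein)
    if up is None:
        return None
    down = CBF_DOWNSTREAM.search(protein, up.end())
    if down is None:
        return None
    return up.start(), down.end()


# Hypothetical sequence: both motifs separated by a placeholder stretch.
candidate = "MNS" + "PKKPAGRKKFRETRHP" + "A" * 10 + "DSAWR" + "LL"
print(find_cbf_signature(candidate))
```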

III. Polypeptides and Polynucleotides of the Invention 

The present invention provides, among other things, transcription factors 
(TFs), and transcription factor homologue polypeptides, and isolated or recombinant 
polynucleotides encoding the polypeptides, or novel variant polypeptides or 
polynucleotides encoding novel variants of transcription factors derived from the 
specific sequences provided here. These polypeptides and polynucleotides may be 
employed to modify a plant's characteristic. 

Exemplary polynucleotides encoding the polypeptides of the invention were 
identified in the Arabidopsis thaliana GenBank database using publicly available 
sequence analysis programs and parameters. Sequences initially identified were then 
further characterized to identify sequences comprising specified sequence strings 
corresponding to sequence motifs present in families of known transcription factors. 
In addition, further exemplary polynucleotides encoding the polypeptides of the 
invention were identified in the plant GenBank database using publicly available 
sequence analysis programs and parameters. Sequences initially identified were then 
further characterized to identify sequences comprising specified sequence strings 
corresponding to sequence motifs present in families of known transcription factors. 
Polynucleotide sequences meeting such criteria were confirmed as transcription 
factors. 

Additional polynucleotides of the invention were identified by screening 
Arabidopsis thaliana and/or other plant cDNA libraries with probes corresponding to 
known transcription factors under low stringency hybridization conditions. 
Additional sequences, including full length coding sequences were subsequently 
recovered by the rapid amplification of cDNA ends (RACE) procedure, using a 
commercially available kit according to the manufacturer's instructions. Where 
necessary, multiple rounds of RACE are performed to isolate 5' and 3' ends. The full 
length cDNA was then recovered by a routine end-to-end polymerase chain reaction 
(PCR) using primers specific to the isolated 5' and 3' ends. Exemplary sequences are 
provided in the Sequence Listing. 

The polynucleotides of the invention can be or were ectopically expressed in 
overexpressor or knockout plants and the changes in the characteristic(s) or trait(s) of 
the plants observed. Therefore, the polynucleotides and polypeptides can be 
employed to improve the characteristics of plants. 



The polynucleotides of the invention can be or were ectopically expressed in 
overexpressor plant cells and the changes in the expression levels of a number of 
genes, polynucleotides, and/or proteins of the plant cells observed. Therefore, the 
polynucleotides and polypeptides can be employed to change expression levels of 
genes, polynucleotides, and/or proteins of plants. 

IV. Producing Polypeptides 

The polynucleotides of the invention include sequences that encode 
transcription factors and transcription factor homologue polypeptides and sequences 
complementary thereto, as well as unique fragments of coding sequence, or sequence 
complementary thereto. Such polynucleotides can be, e.g., DNA or RNA, e.g., 
mRNA, cRNA, synthetic RNA, genomic DNA, cDNA, synthetic DNA, 
oligonucleotides, etc. The polynucleotides are either double-stranded or single- 
stranded, and include either, or both sense (i.e., coding) sequences and antisense (i.e., 
non-coding, complementary) sequences. The polynucleotides include the coding 
sequence of a transcription factor, or transcription factor homologue polypeptide, in 
isolation, in combination with additional coding sequences (e.g., a purification tag, a 
localization signal, as a fusion-protein, as a pre-protein, or the like), in combination 
with non-coding sequences (e.g., introns or inteins, regulatory elements such as 
promoters, enhancers, terminators, and the like), and/or in a vector or host 
environment in which the polynucleotide encoding a transcription factor or 
transcription factor homologue polypeptide is an endogenous or exogenous gene. 

A variety of methods exist for producing the polynucleotides of the invention. 
Procedures for identifying and isolating DNA clones are well known to those of skill 
in the art, and are described in, e.g., Berger and Kimmel, Guide to Molecular Cloning 
Techniques, Methods in Enzymology volume 152 Academic Press, Inc., San Diego, 
CA ("Berger"); Sambrook et al., Molecular Cloning - A Laboratory Manual (2nd 
Ed.), Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, 1989 
("Sambrook") and Current Protocols in Molecular Biology. F. M. Ausubel et al., eds., 
Current Protocols, a joint venture between Greene Publishing Associates, Inc. and 
John Wiley & Sons, Inc., (supplemented through 2000) ("Ausubel"). 

Alternatively, polynucleotides of the invention can be produced by a variety 
of in vitro amplification methods adapted to the present invention by appropriate 
selection of specific or degenerate primers. Examples of protocols sufficient to direct 
persons of skill through in vitro amplification methods, including the polymerase 
chain reaction (PCR), the ligase chain reaction (LCR), Qbeta-replicase amplification 
and other RNA polymerase mediated techniques (e.g., NASBA), e.g., for the 
production of the homologous nucleic acids of the invention are found in Berger 
(supra), Sambrook (supra), and Ausubel (supra), as well as Mullis et al., (1987) PCR 
Protocols A Guide to Methods and Applications (Innis et al. eds) Academic Press Inc. 
San Diego, CA (1990) (Innis). Improved methods for cloning in vitro amplified 
nucleic acids are described in Wallace et al., U.S. Pat. No. 5,426,039. Improved 
methods for amplifying large nucleic acids by PCR are summarized in Cheng et al. 
(1994) Nature 369: 684-685 and the references cited therein, in which PCR amplicons 
of up to 40kb are generated. One of skill will appreciate that essentially any RNA can 
be converted into a double stranded DNA suitable for restriction digestion, PCR 
expansion and sequencing using reverse transcriptase and a polymerase. See, e.g., 
Ausubel, Sambrook and Berger, all supra. 

Alternatively, polynucleotides and oligonucleotides of the invention can be 
assembled from fragments produced by solid-phase synthesis methods. Typically, 
fragments of up to approximately 100 bases are individually synthesized and then 
enzymatically or chemically ligated to produce a desired sequence, e.g., a 
polynucleotide encoding all or part of a transcription factor. For example, chemical 
synthesis using the phosphoramidite method is described, e.g., by Beaucage et al. 
(1981) Tetrahedron Letters 22:1859-1869; and Matthes et al. (1984) EMBO J. 3:801- 
805. According to such methods, oligonucleotides are synthesized, purified, annealed 
to their complementary strand, ligated and then optionally cloned into suitable 
vectors. And if so desired, the polynucleotides and polypeptides of the invention can 
be custom ordered from any of a number of commercial suppliers. 
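
The fragment-based assembly described above can be pictured with a minimal Python sketch that divides a target sequence into consecutive fragments of at most 100 bases and confirms that joining them restores the target. This is a simplification for illustration only: the function name and target sequence are hypothetical, and practical assembly schemes typically design overlapping, complementary oligonucleotides rather than simple end-to-end pieces.

```python
def split_into_fragments(sequence: str, max_len: int = 100) -> list:
    """Split a sequence into consecutive fragments of at most max_len bases."""
    if max_len < 1:
        raise ValueError("max_len must be at least 1")
    return [sequence[i:i + max_len] for i in range(0, len(sequence), max_len)]


# Hypothetical 240-base target sequence.
target = "ACGT" * 60
fragments = split_into_fragments(target)

print([len(f) for f in fragments])
# Joining the fragments in order models ligation into the desired sequence.
print("".join(fragments) == target)
```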

V. Homologous Sequences 

Sequences homologous, i.e., that share significant sequence identity or 
similarity, to those provided in the Sequence Listing, derived from Arabidopsis 
thaliana or from other plants of choice are also an aspect of the invention. 
Homologous sequences can be derived from any plant including monocots and dicots 
and in particular agriculturally important plant species, including but not limited to, 
crops such as soybean, wheat, corn, potato, cotton, rice, rape, oilseed rape (including 
canola), sunflower, alfalfa, sugarcane and turf; or fruits and vegetables, such as 
banana, blackberry, blueberry, strawberry, and raspberry, cantaloupe, carrot, 
cauliflower, coffee, cucumber, eggplant, grapes, honeydew, lettuce, mango, melon, 
onion, papaya, peas, peppers, pineapple, pumpkin, spinach, squash, sweet corn, 
tobacco, tomato, watermelon, rosaceous fruits (such as apple, peach, pear, cherry and 
plum) and vegetable brassicas (such as broccoli, cabbage, cauliflower, Brussels 
sprouts, and kohlrabi). Other crops, fruits and vegetables whose phenotype can be 
changed include barley, rye, millet, sorghum, currant, avocado, citrus fruits such as 
oranges, lemons, grapefruit and tangerines, artichoke, cherries, nuts such as the 
walnut and peanut, endive, leek, roots, such as arrowroot, beet, cassava, turnip, radish, 
yam, and sweet potato, and beans. The homologous sequences may also be derived 
from woody species, such as pine, poplar and eucalyptus, or mint or other labiates. 

Orthologs And Paralogs 

Several different methods are known by those of skill in the art for identifying 
and defining these functionally homologous sequences. Three general methods for 
defining paralogs and orthologs are described; a paralog or ortholog or homolog may 
be identified by one or more of the methods described below. 

Orthologs and paralogs are evolutionarily related genes that have similar 
sequence and similar functions. Orthologs are structurally related genes in different 
species that are derived from a speciation event. Paralogs are structurally related 
genes within a single species that are derived by a duplication event. 

Within a single plant species, gene duplication may cause two copies of a 
particular gene, giving rise to two or more genes with similar sequence and similar 
function known as paralogs. A paralog is therefore a similar gene with a similar 
function within the same species. Paralogs typically cluster together or in the same 
clade (a group of similar genes) when a gene family phylogeny is analyzed using 
programs such as CLUSTAL (Thompson et al. (1994) Nucleic Acids Res. 
22:4673-4680; Higgins et al. (1996) Methods Enzymol. 266:383-402). Groups of similar 
genes can also be identified with pair-wise BLAST analysis (Feng and Doolittle 
(1987) J. Mol. Evol. 25:351-360). For example, a clade of very similar MADS 
domain transcription factors from Arabidopsis all share a common function in 
flowering time (Ratcliffe et al. (2001) Plant Physiol. 126:122-132), and a group of 
very similar AP2 domain transcription factors from Arabidopsis are involved in 
tolerance of plants to freezing (Gilmour et al. (1998) Plant J. 16:433-442). Analysis 
of groups of similar genes with similar function that fall within one clade can yield 
sub-sequences that are particular to the clade. These sub-sequences, known as 
consensus sequences, can not only be used to define the sequences within each clade, 
but define the functions of these genes; genes within a clade may contain paralogous 
or orthologous sequences that share the same function. (See also, for example, Mount, 
D.W. (2001) Bioinformatics: Sequence and Genome Analysis Cold Spring Harbor 
Laboratory Press, Cold Spring Harbor, New York page 543.) 

Speciation, the production of new species from a parental species, can also 
give rise to two or more genes with similar sequence and similar function. These 
genes, termed orthologs, often have an identical function within their host plants and 
are often interchangeable between species without losing function. Because plants 
have common ancestors, many genes in any plant species will have a corresponding 
orthologous gene in another plant species. Once a phylogenetic tree for a gene family 
of one species has been constructed using a program such as CLUSTAL (Thompson 
et al. (1994) Nucleic Acids Res. 22:4673-4680; Higgins et al. (1996) Methods 
Enzymol. 266:383-402), potential orthologous sequences can be placed into the 
phylogenetic tree and their relationship to genes from the species of interest can be 
determined. Once the ortholog pair has been identified, the function of the test 
ortholog can be determined by determining the function of the reference ortholog. 

Transcription factors that are homologous to the listed sequences will typically 
share at least about 30% amino acid sequence identity, or at least about 30% amino 
acid sequence identity outside of a known consensus sequence or consensus DNA- 
binding site. More closely related transcription factors can share at least about 50%, 
about 60%, about 65%, about 70%, about 75% or about 80% or about 90% or about 
95% or about 98% or more sequence identity with the listed sequences, or with the 
listed sequences but excluding or outside a known consensus sequence or consensus 
DNA-binding site, or with the listed sequences excluding one or all conserved 
domains. Factors that are most closely related to the listed sequences share, e.g., at 
least about 85%, about 90% or about 95% or more sequence identity to the listed 
sequences, or to the listed sequences but excluding or outside a known consensus 
sequence or consensus DNA-binding site or outside one or all conserved domains. At 
the nucleotide level, the sequences will typically share at least about 40% nucleotide 
sequence identity, preferably at least about 50%, about 60%, about 70% or about 80% 
sequence identity, and more preferably about 85%, about 90%, about 95% or about 
97% or more sequence identity to one or more of the listed sequences, or to a listed 
sequence but excluding or outside a known consensus sequence or consensus 
DNA-binding site, or outside one or all conserved domains. The degeneracy of the genetic 
code enables major variations in the nucleotide sequence of a polynucleotide while 
maintaining the amino acid sequence of the encoded protein. Conserved domains 
within a transcription factor family may exhibit a higher degree of sequence 
homology, such as at least 65% sequence identity including conservative 
substitutions, and preferably at least 80% sequence identity, and more preferably at 
least 85%, or at least about 86%, or at least about 87%, or at least about 88%, or at 
least about 90%, or at least about 95%, or at least about 98% sequence identity. 
Transcription factors that are homologous to the listed sequences should share at least 
30%, or at least about 60%, or at least about 75%, or at least about 80%, or at least 
about 90%, or at least about 95% amino acid sequence identity over the entire length 
of the polypeptide or the homolog. In addition, transcription factors that are 
homologous to the listed sequences should share at least 30%, or at least about 60%, 
or at least about 75%, or at least about 80%, or at least about 90%, or at least about 
95% amino acid sequence similarity over the entire length of the polypeptide or the 
homolog. 

Percent identity can be determined electronically, e.g., by using the 
MEGALIGN program (DNASTAR, Inc. Madison, Wis.). The MEGALIGN program 
can create alignments between two or more sequences according to different methods, 
e.g., the clustal method. (See, e.g., Higgins, D. G. and P. M. Sharp (1988) Gene
73:237-244.) The clustal algorithm groups sequences into clusters by examining the
distances between all pairs. The clusters are aligned pairwise and then in groups.
Other alignment algorithms or programs may be used, including FASTA, BLAST, or
ENTREZ. FASTA and BLAST are available as a part of the GCG sequence
analysis package (University of Wisconsin, Madison, Wis.), and can be used with or
without default settings. ENTREZ is available through the National Center for
Biotechnology Information. In one embodiment, the percent identity of two 
sequences can be determined by the GCG program with a gap weight of 1, e.g., each 
amino acid gap is weighted as if it were a single amino acid or nucleotide mismatch 
between the two sequences (see USPN 6,262,333). 
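
The gap-weighting convention in the embodiment above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned by one of the programs named above and are supplied as equal-length strings in which '-' marks a gap. The function name percent_identity is hypothetical and is not part of MEGALIGN or the GCG package; each gap column is simply counted as one mismatch.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over an existing pairwise alignment.

    Assumption: each gap column ('-') is scored as a single mismatch,
    mirroring the "gap weight of 1" convention described in the text.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences; they carry no information.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)
```

For the aligned pair MKTAYIAK and MKT-YIAR, six of eight columns match, so the function returns 75.0.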

Other techniques for alignment are described in Methods in Enzymology, vol. 
266: Computer Methods for Macromolecular Sequence Analysis (1996), ed. Doolittle, 
Academic Press, Inc., San Diego, Calif., USA. Preferably, an alignment program that 
permits gaps in the sequence is utilized to align the sequences. The Smith-Waterman
algorithm is one type of algorithm that permits gaps in sequence alignments. See Methods Mol.
Biol. 70: 173-187 (1997). Also, the GAP program using the Needleman and Wunsch
alignment method can be utilized to align sequences. An alternative search strategy
uses MPSRCH software, which runs on a MASPAR computer. MPSRCH uses a
Smith-Waterman algorithm to score sequences on a massively parallel computer.
This approach improves the ability to pick up distantly related matches, and is especially
tolerant of small gaps and nucleotide sequence errors. Nucleic acid-encoded amino 
acid sequences can be used to search both protein and DNA databases. 
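
As a rough illustration of how a gap-permitting local alignment is scored, the following Python sketch computes a Smith-Waterman score by dynamic programming. The scoring values (match +2, mismatch -1, linear gap penalty -2) and the function name are illustrative assumptions; they are not the parameters used by GAP, MPSRCH, or any other program named above.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score of a and b.

    Uses a linear gap penalty and keeps only two rows of the
    dynamic-programming matrix, so memory is O(len(b)).
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero: that is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

Because every cell is floored at zero, unrelated flanking sequence does not lower the score of a shared internal region, which is why this style of search tolerates small gaps and local errors.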

The percentage similarity between two polypeptide sequences, e.g., sequence 
A and sequence B, is calculated by dividing the length of sequence A, minus the 
number of gap residues in sequence A, minus the number of gap residues in sequence 
B, into the sum of the residue matches between sequence A and sequence B, times 
one hundred. Gaps of low or of no similarity between the two amino acid sequences 
are not included in determining percentage similarity. Percent identity between 
polynucleotide sequences can also be counted or calculated by other methods known 
in the art, e.g., the Jotun Hein method. (See, e.g., Hein, J. (1990) Methods Enzymol.
183:626-645.) Identity between sequences can also be determined by other methods
known in the art, e.g., by varying hybridization conditions (see US Patent Application
No. 20010010913). 
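
The percentage similarity calculation described above can be written as similarity = 100 x matches / (length of A - gaps in A - gaps in B). The Python sketch below is one reading of that formula. It assumes that "length of sequence A" means the aligned length including gap characters, and that a "residue match" is an exact match unless the caller supplies a different rule; the function name and the is_match parameter are hypothetical.

```python
from typing import Callable, Optional


def percent_similarity(aligned_a: str, aligned_b: str,
                       is_match: Optional[Callable[[str, str], bool]] = None) -> float:
    """Percentage similarity of two pre-aligned sequences ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if is_match is None:
        # Default assumption: only identical residues count as matches.
        is_match = lambda x, y: x == y
    denominator = len(aligned_a) - aligned_a.count("-") - aligned_b.count("-")
    if denominator <= 0:
        return 0.0
    # Gap columns are excluded from the match count as well as the denominator.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b)
                  if x != "-" and y != "-" and is_match(x, y))
    return 100.0 * matches / denominator
```

Passing a custom is_match, for example one that treats conservatively substituted residues as matching, turns the same calculation into a similarity rather than an identity measure.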

Thus, the invention provides methods for identifying a sequence similar or 
paralogous or orthologous or homologous to one or more polynucleotides as noted 
herein, or one or more target polypeptides encoded by the polynucleotides, or 
otherwise noted herein and may include linking or associating a given plant 
phenotype or gene function with a sequence. In the methods, a sequence database is 
provided (locally or across an internet or intranet) and a query is made against the
sequence database using the relevant sequences herein and associated plant 
phenotypes or gene functions. 

In addition, one or more polynucleotide sequences or one or more 
polypeptides encoded by the polynucleotide sequences may be used to search against 
a BLOCKS (Bairoch et al. (1997) Nucleic Acids Res. 25:217-221), PFAM, and other 
databases which contain previously identified and annotated motifs, sequences and 
gene functions. Methods that search for primary sequence patterns with secondary 
structure gap penalties (Smith et al. (1992) Protein Engineering 5:35-51) as well as 
algorithms such as Basic Local Alignment Search Tool (BLAST; Altschul, S. F. 
(1993) J. Mol. Evol. 36:290-300; Altschul et al. (1990) supra), BLOCKS (Henikoff, 
S. and Henikoff, G. J. (1991) Nucleic Acids Research 19:6565-6572), Hidden Markov 
Models (HMM; Eddy, S. R. (1996) Cur. Opin. Str. Biol. 6:361-365; Sonnhammer et
al. (1997) Proteins 28:405-420), and the like, can be used to manipulate and analyze 
polynucleotide and polypeptide sequences encoded by polynucleotides. These 
databases, algorithms and other methods are well known in the art and are described 
in Ausubel et al. (1997; Short Protocols in Molecular Biology, John Wiley & Sons, 
New York N.Y., unit 7.7) and in Meyers, R. A. (1995; Molecular Biology and 
Biotechnology, Wiley VCH, New York N.Y., p 856-853). 

Furthermore, methods using manual alignment of sequences similar or 
homologous to one or more polynucleotide sequences or one or more polypeptides 
encoded by the polynucleotide sequences may be used to identify regions of similarity 
and conserved domains. Such manual methods are well known to those of skill in the
art and can include, for example, comparisons of tertiary structure between a
polypeptide sequence encoded by a polynucleotide which comprises a known function
with a polypeptide sequence encoded by a polynucleotide sequence which has a 
function not yet determined. Such examples of tertiary structure may comprise 
predicted alpha helices, beta-sheets, amphipathic helices, leucine zipper motifs, zinc 
finger motifs, proline-rich regions, cysteine repeat motifs, and the like. 

VI. Identifying Polynucleotides or Nucleic Acids by Hybridization 

Polynucleotides homologous to the sequences illustrated in the Sequence 
Listing and tables can be identified, e.g., by hybridization to each other under 
stringent or under highly stringent conditions. Single stranded polynucleotides 
hybridize when they associate based on a variety of well characterized physical- 
chemical forces, such as hydrogen bonding, solvent exclusion, base stacking and the 
like. The stringency of a hybridization reflects the degree of sequence identity of the 
nucleic acids involved, such that the higher the stringency, the more similar are the 
two polynucleotide strands. Stringency is influenced by a variety of factors, including 
temperature, salt concentration and composition, organic and non-organic additives, 
solvents, etc. present in both the hybridization and wash solutions and incubations 
(and number thereof), as described in more detail in the references cited above. 
Encompassed by the invention are polynucleotide sequences that are capable of 
hybridizing to the claimed polynucleotide sequences, and, in particular, to those 
shown in SEQ ID NOs: 860; 802; 240; 274; 558; 24; 1120; 44; 460; 286; 120; 130; 
134; 698; 832; 580; 612; 48, and fragments thereof under various conditions of 
stringency. (See, e.g., Wahl, G. M. and S. L. Berger (1987) Methods Enzymol.
152:399-407; Kimmel, A. R. (1987) Methods Enzymol. 152:507-511.) Estimates of 
homology are provided by either DNA-DNA or DNA-RNA hybridization under 
conditions of stringency as is well understood by those skilled in the art (Hames and 
Higgins, Eds. (1985) Nucleic Acid Hybridisation, IRL Press, Oxford, U.K.). 
Stringency conditions can be adjusted to screen for moderately similar fragments, 
such as homologous sequences from distantly related organisms, to highly similar 
fragments, such as genes that duplicate functional enzymes from closely related 
organisms. Post-hybridization washes determine stringency conditions. 

In addition to the nucleotide sequences listed in Tables 4 and 5, full length 
cDNA, orthologs, paralogs and homologs of the present nucleotide sequences may be
identified and isolated using well known methods. The cDNA libraries, orthologs,
paralogs and homologs of the present nucleotide sequences may be screened using
hybridization methods to determine their utility as hybridization targets or
amplification probes.

An example of stringent hybridization conditions for hybridization of 
complementary nucleic acids which have more than 100 complementary residues on a 
filter in a Southern or northern blot is about 5°C to 20°C lower than the thermal 
melting point (Tm) for the specific sequence at a defined ionic strength and pH. The 
Tm is the temperature (under defined ionic strength and pH) at which 50% of the
target sequence hybridizes to a perfectly matched probe. Nucleic acid molecules that
hybridize under stringent conditions will typically hybridize to a probe based on either
the entire cDNA or selected portions, e.g., to a unique subsequence, of the cDNA
under wash conditions of 0.2x SSC to 2.0x SSC, 0.1% SDS at 50-65° C. For
example, high stringency is about 0.2x SSC, 0.1% SDS at 65° C. Ultra-high
stringency will be the same conditions except the wash temperature is raised about 3 
to about 5° C, and ultra-ultra-high stringency will be the same conditions except the 
wash temperature is raised about 6 to about 9° C. For identification of less closely 
related homologues washes can be performed at a lower temperature, e.g., 50° C. In 
general, stringency is increased by raising the wash temperature and/or decreasing the 
concentration of SSC, as known in the art. 
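
The wash stringency tiers described above can be summarized in a small Python sketch. Since the text gives only approximate ("about") temperatures, the exact cut-offs used here (3 and 6 degrees above 65° C) are assumptions made for illustration, as are the function name and the tier labels it returns.

```python
def classify_wash(temp_c: float, ssc_x: float) -> str:
    """Classify a wash (temperature in deg C, SSC as a multiple of 1x).

    Tiers follow the text: high stringency is about 0.2x SSC at 65 C,
    ultra-high is about 3 to 5 C warmer, ultra-ultra-high about 6 to 9 C
    warmer. The hard cut-offs are illustrative assumptions.
    """
    if ssc_x <= 0.2 and temp_c >= 65.0:
        offset = temp_c - 65.0
        if offset >= 6.0:
            return "ultra-ultra-high"
        if offset >= 3.0:
            return "ultra-high"
        return "high"
    # General stringent window from the text: 0.2x to 2.0x SSC at 50-65 C.
    if ssc_x <= 2.0 and temp_c >= 50.0:
        return "stringent"
    return "below stringent range"
```

For example, a wash at 68° C in 0.2x SSC falls in the "ultra-high" tier under these assumptions, while a wash at 50° C in 2.0x SSC is at the permissive end of the stringent window.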

In another example, stringent salt concentration will ordinarily be less than 
about 750 mM NaCl and 75 mM trisodium citrate, preferably less than about 500 mM
NaCl and 50 mM trisodium citrate, and most preferably less than about 250 mM NaCl 
and 25 mM trisodium citrate. Low stringency hybridization can be obtained in the 
absence of organic solvent, e.g., formamide, while high stringency hybridization can 
be obtained in the presence of at least about 35% formamide, and most preferably at 
least about 50% formamide. Stringent temperature conditions will ordinarily include 
temperatures of at least about 30° C, more preferably of at least about 37° C, and most 
preferably of at least about 42° C. Varying additional parameters, such as 
hybridization time, the concentration of detergent, e.g., sodium dodecyl sulfate (SDS), 
and the inclusion or exclusion of carrier DNA, are well known to those skilled in the
art. Various levels of stringency are accomplished by combining these various
conditions as needed. In a preferred embodiment, hybridization will occur at 30° C in 
750 mM NaCl, 75 mM trisodium citrate, and 1% SDS. In a more preferred 
embodiment, hybridization will occur at 37° C in 500 mM NaCl, 50 mM trisodium 
citrate, 1% SDS, 35% formamide, and 100 µg/ml denatured salmon sperm DNA
(ssDNA). In a most preferred embodiment, hybridization will occur at 42° C in 250
mM NaCl, 25 mM trisodium citrate, 1% SDS, 50% formamide, and 200 µg/ml
ssDNA. Useful variations on these conditions will be readily apparent to those skilled 
in the art. 

The washing steps that follow hybridization can also vary in stringency. Wash 
stringency conditions can be defined by salt concentration and by temperature. As 
above, wash stringency can be increased by decreasing salt concentration or by 
increasing temperature. For example, stringent salt concentration for the wash steps 
will preferably be less than about 30 mM NaCl and 3 mM trisodium citrate, and most 
preferably less than about 15 mM NaCl and 1.5 mM trisodium citrate. Stringent
temperature conditions for the wash steps will ordinarily include a temperature of at
least about 25° C, more preferably of at least about 42° C. Another preferred set of
highly stringent conditions uses two final washes in 0.1X SSC, 0.1% SDS at 65° C.
The most preferred high stringency washes are of at least about 68° C. For example,
in a preferred embodiment, wash steps will occur at 25° C in 30 mM NaCl, 3 mM
trisodium citrate, and 0.1% SDS. In a more preferred embodiment, wash steps will
occur at 42° C in 15 mM NaCl, 1.5 mM trisodium citrate, and 0.1% SDS. In a most
preferred embodiment, the wash steps will occur at 68° C in 15 mM NaCl, 1.5 mM
trisodium citrate, and 0.1% SDS. Additional variations on these conditions will be
readily apparent to those skilled in the art (see U.S. Patent Application No. 
20010010913). 
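
The salt conditions above are given both as SSC multiples and as millimolar concentrations. The following Python sketch converts between the two, assuming the standard composition of 1x SSC (150 mM NaCl and 15 mM trisodium citrate), which is consistent with the paired values in the text (e.g., 0.1x SSC corresponding to 15 mM NaCl and 1.5 mM trisodium citrate). The function names are hypothetical.

```python
from typing import Tuple

# Assumed standard composition of 1x SSC.
NACL_MM_PER_1X_SSC = 150.0
CITRATE_MM_PER_1X_SSC = 15.0


def ssc_to_mm(ssc_x: float) -> Tuple[float, float]:
    """Return (NaCl mM, trisodium citrate mM) for a given SSC multiple."""
    return ssc_x * NACL_MM_PER_1X_SSC, ssc_x * CITRATE_MM_PER_1X_SSC


def nacl_mm_to_ssc(nacl_mm: float) -> float:
    """Return the SSC multiple that supplies the given NaCl concentration."""
    return nacl_mm / NACL_MM_PER_1X_SSC
```

Under this assumption, the 750 mM NaCl hybridization condition corresponds to 5x SSC, and the 30 mM NaCl wash corresponds to 0.2x SSC.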

As another example, stringent conditions can be selected such that an 
oligonucleotide that is perfectly complementary to the coding oligonucleotide 
hybridizes to the coding oligonucleotide with at least about a 5-10x higher signal to
noise ratio than the ratio for hybridization of the perfectly complementary
oligonucleotide to a nucleic acid encoding a transcription factor known as of the filing
date of the application. Conditions can be selected such that a higher signal to noise
ratio is observed in the particular assay which is used, e.g., about 15x, 25x, 35x, 50x
or more. Accordingly, the subject nucleic acid hybridizes to the unique coding
oligonucleotide with at least a 2x higher signal to noise ratio as compared to
hybridization of the coding oligonucleotide to a nucleic acid encoding a known
polypeptide. Again, higher signal to noise ratios can be selected, e.g., about 5x, 10x,
25x, 35x, 50x or more. The particular signal will depend on the label used in the 
relevant assay, e.g., a fluorescent label, a colorimetric label, a radioactive label, or the 
like. 

Alternatively, transcription factor homolog polypeptides can be obtained by 
screening an expression library using antibodies specific for one or more transcription 
factors. With the provision herein of the disclosed transcription factor, and 
transcription factor homologue nucleic acid sequences, the encoded polypeptide(s)
can be expressed and purified in a heterologous expression system (e.g., E. coli) and
used to raise antibodies (monoclonal or polyclonal) specific for the polypeptide(s) in 
question. Antibodies can also be raised against synthetic peptides derived from 
transcription factor, or transcription factor homologue, amino acid sequences. 
Methods of raising antibodies are well known in the art and are described in Harlow 
and Lane (1988) Antibodies: A Laboratory Manual Cold Spring Harbor Laboratory, 
New York. Such antibodies can then be used to screen an expression library 
produced from the plant from which it is desired to clone additional transcription 
factor homologues, using the methods described above. The selected cDNAs can be 
confirmed by sequencing and enzymatic activity. 

VII. Sequence Variations 

It will readily be appreciated by those of skill in the art, that any of a variety of 
polynucleotide sequences are capable of encoding the transcription factors and 
transcription factor homologue polypeptides of the invention. Due to the degeneracy 
of the genetic code, many different polynucleotides can encode identical and/or 
substantially similar polypeptides in addition to those sequences illustrated in the 
Sequence Listing. Nucleic acids having a sequence that differs from the sequences 
shown in the Sequence Listing, or complementary sequences, that encode functionally 
equivalent peptides (i.e., peptides having some degree of equivalent or similar
biological activity) but differ in sequence from the sequence shown in the Sequence
Listing due to degeneracy in the genetic code, are also within the scope of the
invention. 

Altered polynucleotide sequences encoding polypeptides include those 
sequences with deletions, insertions, or substitutions of different nucleotides, resulting 
in a polynucleotide encoding a polypeptide with at least one functional characteristic 
of the instant polypeptides. Included within this definition are polymorphisms which 
may or may not be readily detectable using a particular oligonucleotide probe of the 
polynucleotide encoding the instant polypeptides, and improper or unexpected 
hybridization to allelic variants, with a locus other than the normal chromosomal 
locus for the polynucleotide sequence encoding the instant polypeptides. 

Allelic variant refers to any of two or more alternative forms of a gene 
occupying the same chromosomal locus. Allelic variation arises naturally through 
mutation, and may result in phenotypic polymorphism within populations. Gene 
mutations can be silent (i.e., no change in the encoded polypeptide) or may encode 
polypeptides having altered amino acid sequence. The term allelic variant is also used 
herein to denote a protein encoded by an allelic variant of a gene. Splice variant refers 
to alternative forms of RNA transcribed from a gene. Splice variation arises naturally 
through use of alternative splicing sites within a transcribed RNA molecule, or less 
commonly between separately transcribed RNA molecules, and may result in several 
mRNAs transcribed from the same gene. Splice variants may encode polypeptides 
having altered amino acid sequence. The term splice variant is also used herein to 
denote a protein encoded by a splice variant of an mRNA transcribed from a gene. 

Those skilled in the art would recognize that the polypeptide sequence G681, 
SEQ ID NO: 580, represents a single transcription factor; allelic variation and 
alternative splicing may be expected to occur. Allelic variants of the polypeptide 
sequence of SEQ ID NO: 579 can be cloned by probing cDNA or genomic libraries 
from different individual organisms according to standard procedures. Allelic 
variants of the DNA sequence shown in SEQ ID NO: 579, including those containing 
silent mutations and those in which mutations result in amino acid sequence changes, 
are within the scope of the present invention, as are proteins which are allelic variants
of SEQ ID NO: 580. cDNAs generated from alternatively spliced mRNAs which
retain the properties of the transcription factor are included within the scope of the
present invention, as are polypeptides encoded by such cDNAs and mRNAs. Allelic 
variants and splice variants of these sequences can be cloned by probing cDNA or 
genomic libraries from different individual organisms or tissues according to standard 
procedures known in the art (see USPN 6,388,064). 

For example, Table 1 illustrates that the codons AGC, AGT, TCA, TCC,
TCG, and TCT all encode the same amino acid: serine. Accordingly, at each position 
in the sequence where there is a codon encoding serine, any of the above trinucleotide 
sequences can be used without altering the encoded polypeptide.

Table 1

Amino acid                   Possible Codons
Alanine         Ala   A      GCA  GCC  GCG  GCT
Cysteine        Cys   C      TGC  TGT
Aspartic acid   Asp   D      GAC  GAT
Glutamic acid   Glu   E      GAA  GAG
Phenylalanine   Phe   F      TTC  TTT
Glycine         Gly   G      GGA  GGC  GGG  GGT
Histidine       His   H      CAC  CAT
Isoleucine      Ile   I      ATA  ATC  ATT
Lysine          Lys   K      AAA  AAG
Leucine         Leu   L      TTA  TTG  CTA  CTC  CTG  CTT
Methionine      Met   M      ATG
Asparagine      Asn   N      AAC  AAT
Proline         Pro   P      CCA  CCC  CCG  CCT
Glutamine       Gln   Q      CAA  CAG
Arginine        Arg   R      AGA  AGG  CGA  CGC  CGG  CGT
Serine          Ser   S      AGC  AGT  TCA  TCC  TCG  TCT
Threonine       Thr   T      ACA  ACC  ACG  ACT
Valine          Val   V      GTA  GTC  GTG  GTT
Tryptophan      Trp   W      TGG
Tyrosine        Tyr   Y      TAC  TAT

Sequence alterations that do not change the amino acid sequence encoded by 
the polynucleotide are termed "silent" variations. With the exception of the codons 
ATG and TGG, encoding methionine and tryptophan, respectively, any of the possible 
codons for the same amino acid can be substituted by a variety of techniques, e.g., 
site-directed mutagenesis, available in the art. Accordingly, any and all such 
variations of a sequence selected from the above table are a feature of the invention.

In addition to silent variations, other conservative variations that alter one or a
few amino acids in the encoded polypeptide can be made without altering the
function of the polypeptide; these conservative variants are, likewise, a feature of the
invention.
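
The notion of a silent variation can be illustrated with a short Python sketch built on the standard genetic code, written in the DNA alphabet used in Table 1. The table construction, the function names, and the use of '*' for stop codons are choices made for this example and are not taken from the disclosure.

```python
from itertools import product
from typing import List

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def synonymous_codons(codon: str) -> List[str]:
    """Return every codon that encodes the same amino acid as `codon`."""
    amino_acid = CODON_TABLE[codon.upper()]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def translate(cds: str) -> str:
    """Translate a coding sequence whose length is a multiple of three."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def is_silent(original: str, variant: str) -> bool:
    """True if both sequences encode the same polypeptide."""
    return len(original) == len(variant) and translate(original) == translate(variant)
```

Calling synonymous_codons("AGC") returns the six serine codons listed in Table 1, whereas ATG and TGG each return only themselves, which is why methionine and tryptophan are excepted above.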

For example, substitutions, deletions and insertions introduced into the 
sequences provided in the Sequence Listing are also envisioned by the invention. 
Such sequence modifications can be engineered into a sequence by site-directed 
mutagenesis (Wu (ed.) Meth. Enzymol. (1993) vol. 217, Academic Press) or the other
methods noted below. Amino acid substitutions are typically of single residues; 
insertions usually will be on the order of about from 1 to 10 amino acid residues; and 
deletions will range about from 1 to 30 residues. In preferred embodiments, deletions 
or insertions are made in adjacent pairs, e.g., a deletion of two residues or insertion of 
two residues. Substitutions, deletions, insertions or any combination thereof can be 
combined to arrive at a sequence. The mutations that are made in the polynucleotide 
encoding the transcription factor should not place the sequence out of reading frame 
and should not create complementary regions that could produce secondary mRNA 
structure. Preferably, the polypeptide encoded by the DNA performs the desired 
function. 

Conservative substitutions are those in which at least one residue in the amino 
acid sequence has been removed and a different residue inserted in its place. Such 
substitutions generally are made in accordance with Table 2 when it is desired to
maintain the activity of the protein. Table 2 shows amino acids which can be 
substituted for an amino acid in a protein and which are typically regarded as 
conservative substitutions.

Table 2

Residue    Conservative Substitutions
Ala        Ser
Arg        Lys
Asn        Gln; His
Asp        Glu
Gln        Asn
Cys        Ser
Glu        Asp
Gly        Pro
His        Asn; Gln
Ile        Leu; Val
Leu        Ile; Val
Lys        Arg; Gln
Met        Leu; Ile
Phe        Met; Leu; Tyr
Ser        Thr; Gly
Thr        Ser; Val
Trp        Tyr
Tyr        Trp; Phe
Val        Ile; Leu
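
Table 2 can be treated as a simple lookup. The Python sketch below encodes it as a dictionary and checks whether a proposed replacement is listed for a given residue. The residue abbreviations follow the standard three-letter codes (reading the garbled entries in the table as Gln and Ile is an assumption), and the function name is hypothetical. The lookup is directional, because Table 2 itself is not symmetric.

```python
# Table 2 as a mapping: residue -> substitutions regarded as conservative.
CONSERVATIVE = {
    "Ala": {"Ser"}, "Arg": {"Lys"}, "Asn": {"Gln", "His"}, "Asp": {"Glu"},
    "Gln": {"Asn"}, "Cys": {"Ser"}, "Glu": {"Asp"}, "Gly": {"Pro"},
    "His": {"Asn", "Gln"}, "Ile": {"Leu", "Val"}, "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln"}, "Met": {"Leu", "Ile"}, "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr", "Gly"}, "Thr": {"Ser", "Val"}, "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"}, "Val": {"Ile", "Leu"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if `replacement` is listed in Table 2 for `original`."""
    return replacement in CONSERVATIVE.get(original, set())
```

For example, is_conservative("Ala", "Ser") is true, while is_conservative("Ser", "Ala") is false because Ala is not listed in the Ser row.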



Similar substitutions are those in which at least one residue in the amino acid 
sequence has been removed and a different residue inserted in its place. Such 
substitutions generally are made in accordance with Table 3 when it is desired to
maintain the activity of the protein. Table 3 shows amino acids which can be 
substituted for an amino acid in a protein and which are typically regarded as 
structural and functional substitutions. For example, a residue in column 1 of Table 3
may be substituted with a residue in column 2; in addition, a residue in column 2 of
Table 3 may be substituted with the residue of column 1.



Table 3

Residue    Similar Substitutions
Ala        Ser; Thr; Gly; Val; Leu; Ile
Arg        Lys; His; Gly
Asn        Gln; His; Gly; Ser; Thr
Asp        Glu; Ser; Thr
Gln        Asn; Ala
Cys        Ser; Gly
Glu        Asp
Gly        Pro; Arg
His        Asn; Gln; Tyr; Phe; Lys; Arg
Ile        Ala; Leu; Val; Gly; Met
Leu        Ala; Ile; Val; Gly; Met
Lys        Arg; His; Gln; Gly; Pro
Met        Leu; Ile; Phe
Phe        Met; Leu; Tyr; Trp; His; Val; Ala
Ser        Thr; Gly; Asp; Ala; Val; Ile; His
Thr        Ser; Val; Ala; Gly
Trp        Tyr; Phe; His
Tyr        Trp; Phe; His
Val        Ala; Ile; Leu; Gly; Thr; Ser; Glu



Substitutions that are less conservative than those in Table 2 can be selected 
by picking residues that differ more significantly in their effect on maintaining (a) the 
structure of the polypeptide backbone in the area of the substitution, for example, as a 
sheet or helical conformation, (b) the charge or hydrophobicity of the molecule at the 
target site, or (c) the bulk of the side chain. The substitutions which in general are
expected to produce the greatest changes in protein properties will be those in which
(a) a hydrophilic residue, e.g., seryl or threonyl, is substituted for (or by) a 
hydrophobic residue, e.g., leucyl, isoleucyl, phenylalanyl, valyl or alanyl; (b) a 
cysteine or proline is substituted for (or by) any other residue; (c) a residue having an 
electropositive side chain, e.g., lysyl, arginyl, or histidyl, is substituted for (or by) an 
electronegative residue, e.g., glutamyl or aspartyl; or (d) a residue having a bulky side 
chain, e.g., phenylalanine, is substituted for (or by) one not having a side chain, e.g., 
glycine. 
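
The four classes of substitution listed above can be expressed as a small rule check. The Python sketch below uses only the example residues named in the text for each class, so its residue sets are deliberately incomplete, and the function name and the letter labels it returns are assumptions made for illustration.

```python
from typing import List

# Residue sets limited to the examples given in the text.
HYDROPHILIC = {"Ser", "Thr"}
HYDROPHOBIC = {"Leu", "Ile", "Phe", "Val", "Ala"}
ELECTROPOSITIVE = {"Lys", "Arg", "His"}
ELECTRONEGATIVE = {"Glu", "Asp"}
BULKY = {"Phe"}
NO_SIDE_CHAIN = {"Gly"}


def _between(x: str, y: str, group_1: set, group_2: set) -> bool:
    """True if x and y fall in opposite groups, in either direction."""
    return (x in group_1 and y in group_2) or (x in group_2 and y in group_1)


def disruptive_classes(original: str, replacement: str) -> List[str]:
    """Return the labels (a)-(d) of the classes a substitution falls into."""
    if original == replacement:
        return []
    labels = []
    if _between(original, replacement, HYDROPHILIC, HYDROPHOBIC):
        labels.append("a")
    if "Cys" in (original, replacement) or "Pro" in (original, replacement):
        labels.append("b")
    if _between(original, replacement, ELECTROPOSITIVE, ELECTRONEGATIVE):
        labels.append("c")
    if _between(original, replacement, BULKY, NO_SIDE_CHAIN):
        labels.append("d")
    return labels
```

An empty result only means that none of the four listed patterns applies; it does not by itself establish that a substitution is conservative.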

VIII. Further Modifying Sequences of the Invention - Mutation/Forced
Evolution 

In addition to generating silent or conservative substitutions as noted above,
the present invention optionally includes methods of modifying the sequences of the 
Sequence Listing. In the methods, nucleic acid or protein modification methods are 
used to alter the given sequences to produce new sequences and/or to chemically or 
enzymatically modify given sequences to change the properties of the nucleic acids or 
proteins. 

Thus, in one embodiment, given nucleic acid sequences are modified, e.g., 
according to standard mutagenesis or artificial evolution methods to produce modified 
sequences. The modified sequences may be created using purified natural 
polynucleotides isolated from any organism or may be synthesized from purified 
compositions and chemicals using chemical means well known to those of skill in the
art. For example, Ausubel, supra, provides additional details on mutagenesis 
methods. Artificial forced evolution methods are described, for example, by Stemmer 
(1994) Nature 370:389-391, Stemmer (1994) Proc. Natl. Acad. Sci. USA 91:10747- 
10751, and U.S. Patents 5,811,238, 5,837,500, and 6,242,568. Methods for 
engineering synthetic transcription factors and other polypeptides are described, for 
example, by Zhang et al. (2000) J. Biol. Chem. 275:33850-33860, Liu et al. (2001) J. 
Biol. Chem. 276:11323-11334, and Isalan et al. (2001) Nature Biotechnol. 19:656-660.
Many other mutation and evolution methods are also available and expected to
be within the skill of the practitioner.

Similarly, chemical or enzymatic alteration of expressed nucleic acids and
polypeptides can be performed by standard methods. For example, a sequence can be
modified by addition of lipids, sugars, peptides, organic or inorganic compounds, by 
the inclusion of modified nucleotides or amino acids, or the like. For example, 
protein modification techniques are illustrated in Ausubel, supra. Further details on 
chemical and enzymatic modifications can be found herein. These modification 
methods can be used to modify any given sequence, or to modify any sequence 
produced by the various mutation and artificial evolution modification methods noted 
herein. 

Accordingly, the invention provides for modification of any given nucleic acid 
by mutation, evolution, chemical or enzymatic modification, or other available 
methods, as well as for the products produced by practicing such methods, e.g., using 
the sequences herein as a starting substrate for the various modification approaches. 

For example, optimized coding sequence containing codons preferred by a 
particular prokaryotic or eukaryotic host can be used e.g., to increase the rate of 
translation or to produce recombinant RNA transcripts having desirable properties, 
such as a longer half-life, as compared with transcripts produced using a non- 
optimized sequence. Translation stop codons can also be modified to reflect host 
preference. For example, preferred stop codons for Saccharomyces cerevisiae and 
mammals are TAA and TGA, respectively. The preferred stop codon for 
monocotyledonous plants is TGA, whereas insects and E. coli prefer to use TAA as 
the stop codon. 
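
The stop codon preferences stated above can be applied mechanically, as in the following Python sketch. The host labels, the dictionary, and the function name are hypothetical; only the host-to-codon pairings come from the text, and the sketch changes nothing but the terminal codon.

```python
# Preferred stop codons as stated in the text; host labels are illustrative.
PREFERRED_STOP = {
    "Saccharomyces cerevisiae": "TAA",
    "mammal": "TGA",
    "monocot": "TGA",
    "insect": "TAA",
    "Escherichia coli": "TAA",
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def adapt_stop_codon(cds: str, host: str) -> str:
    """Replace the terminal stop codon of `cds` with the host's preferred one."""
    cds = cds.upper()
    if len(cds) % 3 != 0 or cds[-3:] not in STOP_CODONS:
        raise ValueError("coding sequence must be in frame and end with a stop codon")
    return cds[:-3] + PREFERRED_STOP[host]
```

For instance, adapt_stop_codon("ATGGCTTAG", "monocot") returns "ATGGCTTGA", leaving the encoded polypeptide unchanged.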

The polynucleotide sequences of the present invention can also be engineered 
in order to alter a coding sequence for a variety of reasons, including but not limited 
to, alterations which modify the sequence to facilitate cloning, processing and/or 
expression of the gene product. For example, alterations are optionally introduced 
using techniques which are well known in the art, e.g., site-directed mutagenesis, to 
insert new restriction sites, to alter glycosylation patterns, to change codon preference, 
to introduce splice sites, etc.

Furthermore, a fragment or domain derived from any of the polypeptides of
the invention can be combined with domains derived from other transcription factors 
or synthetic domains to modify the biological activity of a transcription factor. For 
instance, a DNA-binding domain derived from a transcription factor of the invention 
can be combined with the activation domain of another transcription factor or with a 
synthetic activation domain. A transcription activation domain assists in initiating 
transcription from a DNA-binding site. Examples include the transcription activation 
region of VP16 or GAL4 (Moore et al. (1998) Proc. Natl. Acad. Sci. USA 95: 376- 
381; and Aoyama et al. (1995) Plant Cell 7:1773-1785), peptides derived from 
bacterial sequences (Ma and Ptashne (1987) Cell 51:113-119) and synthetic peptides
(Giniger and Ptashne, (1987) Nature 330:670-672). 

IX. Expression and Modification of Polypeptides

Typically, polynucleotide sequences of the invention are incorporated into 
recombinant DNA (or RNA) molecules that direct expression of polypeptides of the 
invention in appropriate host cells, transgenic plants, in vitro translation systems, or 
the like. Due to the inherent degeneracy of the genetic code, nucleic acid sequences 
which encode substantially the same or a functionally equivalent amino acid sequence 
can be substituted for any listed sequence to provide for cloning and expressing the 
relevant homologue. 

X. Vectors, Promoters, and Expression Systems 

The present invention includes recombinant constructs comprising one or 
more of the nucleic acid sequences herein. The constructs typically comprise a 
vector, such as a plasmid, a cosmid, a phage, a virus (e.g., a plant virus), a bacterial 
artificial chromosome (BAC), a yeast artificial chromosome (YAC), or the like, into 
which a nucleic acid sequence of the invention has been inserted, in a forward or 
reverse orientation. In a preferred aspect of this embodiment, the construct further 
comprises regulatory sequences, including, for example, a promoter, operably linked 
to the sequence. Large numbers of suitable vectors and promoters are known to those 
of skill in the art, and are commercially available. 

General texts that describe molecular biological techniques useful herein, 
including the use and production of vectors, promoters and many other relevant 
topics, include Berger, Sambrook and Ausubel, supra. Any of the identified sequences
can be incorporated into a cassette or vector, e.g., for expression in plants. A number of
expression vectors suitable for stable transformation of plant cells or for the 
establishment of transgenic plants have been described including those described in 
Weissbach and Weissbach, (1989) Methods for Plant Molecular Biology. Academic 
Press, and Gelvin et al., (1990) Plant Molecular Biology Manual, Kluwer Academic 
Publishers. Specific examples include those derived from a Ti plasmid of 
Agrobacterium tumefaciens, as well as those disclosed by Herrera-Estrella et al. 
(1983) Nature 303: 209, Bevan (1984) Nucl Acid Res. 12: 8711-8721, Klee (1985) 
Bio/Technology 3: 637-642, for dicotyledonous plants. 

Alternatively, non-Ti vectors can be used to transfer the DNA into 
monocotyledonous plants and cells by using free DNA delivery techniques. Such 
methods can involve, for example, the use of liposomes, electroporation, 
microprojectile bombardment, silicon carbide whiskers, and viruses. By using these 
methods transgenic plants such as wheat, rice (Christou (1991) Bio/Technology 9: 
957-962) and corn (Gordon-Kamm (1990) Plant Cell 2: 603-618) can be produced. 
An immature embryo can also be a good target tissue for monocots for direct DNA
delivery techniques by using the particle gun (Weeks et al. (1993) Plant Physiol 102:
1077-1084; Vasil (1993) Bio/Technology 10: 667-674; Wan and Lemaux (1994)
Plant Physiol 104: 37-48), and for Agrobacterium-mediated DNA transfer (Ishida et al.
(1996) Nature Biotech 14: 745-750).

Typically, plant transformation vectors include one or more cloned plant 
coding sequence (genomic or cDNA) under the transcriptional control of 5' and 3'
regulatory sequences and a dominant selectable marker. Such plant transformation
vectors typically also contain a promoter (e.g., a regulatory region controlling
inducible or constitutive, environmentally- or developmentally-regulated, or cell- or
tissue-specific expression), a transcription initiation start site, an RNA processing 
signal (such as intron splice sites), a transcription termination site, and/or a 
polyadenylation signal. 
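
As a purely illustrative sketch, the elements of such a transformation vector can be represented as a simple record. The class name, field names and placeholder values below are assumptions introduced for this example only; the promoter and terminator named are among those mentioned in this section, and no particular vector of the invention is implied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpressionCassette:
    """Hypothetical record of the elements carried by a plant transformation vector."""
    promoter: str           # 5' regulatory sequence
    coding_sequence: str    # cloned genomic or cDNA coding sequence
    terminator: str         # 3' regulatory sequence
    selectable_marker: str  # dominant selectable marker on the same vector

    def layout(self):
        """Return the cassette elements in 5'-to-3' order, followed by the marker."""
        return [self.promoter, self.coding_sequence, self.terminator, self.selectable_marker]

    def is_complete(self):
        """True when every element has been specified."""
        return all(part.strip() for part in self.layout())


cassette = ExpressionCassette(
    promoter="CaMV 35S promoter",
    coding_sequence="TF coding sequence (placeholder)",
    terminator="nopaline synthase 3' terminator",
    selectable_marker="selectable marker (placeholder)",
)
print(cassette.layout(), cassette.is_complete())
```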

Examples of constitutive plant promoters which can be useful for expressing 
the TF sequence include: the cauliflower mosaic virus (CaMV) 35S promoter, which 
confers constitutive, high-level expression in most plant tissues (see, e.g., Odell et al.
(1985) Nature 313:810-812); the nopaline synthase promoter (An et al. (1988) Plant 
Physiol 88:547-552); and the octopine synthase promoter (Fromm et al. (1989) Plant 
Cell 1:977-984). 

A variety of plant gene promoters that regulate gene expression in response to 
environmental, hormonal, chemical, developmental signals, and in a tissue-active 
manner can be used for expression of a TF sequence in plants. Choice of a promoter 
is based largely on the phenotype of interest and is determined by such factors as 
tissue (e.g., seed, fruit, root, pollen, vascular tissue, flower, carpel, etc.), inducibility 
(e.g., in response to wounding, heat, cold, drought, light, pathogens, etc.), timing, 
developmental stage, and the like. Numerous known promoters have been 
characterized and can favorably be employed to promote expression of a 
polynucleotide of the invention in a transgenic plant or cell of interest. For example, 
tissue specific promoters include: seed-specific promoters (such as the napin, 
phaseolin or DC3 promoter described in US Pat. No. 5,773,697), fruit-specific 
promoters that are active during fruit ripening (such as the dru1 promoter (US Pat.
No. 5,783,393), or the 2A11 promoter (US Pat. No. 4,943,674) and the tomato
polygalacturonase promoter (Bird et al. (1988) Plant Mol Biol 11:651)), root-specific
promoters, such as those disclosed in US Patent Nos. 5,618,988, 5,837,848 and 
5,905,186, pollen-active promoters such as PTA29, PTA26 and PTA13 (US Pat. No. 
5,792,929), promoters active in vascular tissue (Ringli and Keller (1998) Plant Mol 
Biol 37:977-988), flower-specific (Kaiser et al. (1995) Plant Mol Biol 28:231-243),
pollen (Baerson et al. (1994) Plant Mol Biol 26:1947-1959), carpels (Ohl et al. (1990) 
Plant Cell 2:837-848), pollen and ovules (Baerson et al. (1993) Plant Mol Biol 
22:255-267), auxin-inducible promoters (such as that described in van der Kop et al. 
(1999) Plant Mol Biol 39:979-990 or Baumann et al. (1999) Plant Cell 11:323-334),
cytokinin-inducible promoter (Guevara-Garcia (1998) Plant Mol Biol 38:743-753), 
promoters responsive to gibberellin (Shi et al. (1998) Plant Mol Biol 38:1053-1060, 
Willmott et al. (1998) 38:817-825) and the like. Additional promoters are those that 
elicit expression in response to heat (Ainley et al. (1993) Plant Mol Biol 22: 13-23), 
light (e.g., the pea rbcS-3A promoter, Kuhlemeier et al. (1989) Plant Cell 1:471, and 
the maize rbcS promoter, Schaffner and Sheen (1991) Plant Cell 3: 997); wounding
(e.g., wun1, Siebertz et al. (1989) Plant Cell 1: 961); pathogens (such as the PR-1
promoter described in Buchel et al. (1999) Plant Mol. Biol. 40:387-396, and the
PDF1.2 promoter described in Manners et al. (1998) Plant Mol. Biol. 38:1071-80), 
and chemicals such as methyl jasmonate or salicylic acid (Gatz et al. (1997) Plant Mol 
Biol 48: 89-108). In addition, the timing of the expression can be controlled by using 
promoters such as those acting at senescence (Gan and Amasino (1995) Science 270:
1986-1988); or late seed development (Odell et al. (1994) Plant Physiol 106:447-458). 

Plant expression vectors can also include RNA processing signals that can be 
positioned within, upstream or downstream of the coding sequence. In addition, the 
expression vectors can include additional regulatory sequences from the 3'
untranslated region of plant genes, e.g., a 3' terminator region to increase
stability of the mRNA, such as the PI-II terminator region of potato or the octopine or
nopaline synthase 3' terminator regions.

Additional Expression Elements 

Specific initiation signals can aid in efficient translation of coding sequences. 
These signals can include, e.g., the ATG initiation codon and adjacent sequences. In 
cases where a coding sequence, its initiation codon and upstream sequences are
inserted into the appropriate expression vector, no additional translational control
signals may be needed. However, in cases where only coding sequence (e.g., a
mature protein coding sequence), or a portion thereof, is inserted, exogenous
translational control signals including the ATG initiation codon can be separately
provided. The initiation codon is provided in the correct reading frame to facilitate
translation. Exogenous translational elements and initiation codons can be of
various origins, both natural and synthetic. The efficiency of expression can be 
enhanced by the inclusion of enhancers appropriate to the cell system in use. 
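
The reading-frame requirement for a separately provided initiation codon can be illustrated with a minimal sketch. The function name and the short sequences below are hypothetical and serve only to show the arithmetic: the coding sequence is read in its own frame only when it begins a whole number of codons downstream of the ATG.

```python
def in_frame(construct, coding):
    """Return True if `coding` lies in the reading frame set by the first ATG of `construct`.

    Simplified check: uses the first ATG found and the first occurrence of the
    coding sequence downstream of it.
    """
    construct, coding = construct.upper(), coding.upper()
    start = construct.find("ATG")
    if start < 0:
        return False  # no initiation codon supplied
    pos = construct.find(coding, start + 3)
    return pos >= 0 and (pos - start) % 3 == 0


# Hypothetical mature-protein coding fragment lacking its own initiation codon.
mature_cds = "GCTCTGTCTTAA"

print(in_frame("ATG" + mature_cds, mature_cds))   # ATG placed directly upstream
print(in_frame("ATGC" + mature_cds, mature_cds))  # one extra base shifts the frame
```

In the second construct the single additional base between the ATG and the coding sequence moves the coding sequence out of frame, so the intended polypeptide would not be produced.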

Expression Hosts 

The present invention also relates to host cells which are transduced with 
vectors of the invention, and the production of polypeptides of the invention 
(including fragments thereof) by recombinant techniques. Host cells are genetically 
engineered (i.e., nucleic acids are introduced, e.g., transduced, transformed or 
transfected) with the vectors of this invention, which may be, for example, a cloning 
vector or an expression vector comprising the relevant nucleic acids herein. The 
vector is optionally a plasmid, a viral particle, a phage, a naked nucleic acid, etc. The
engineered host cells can be cultured in conventional nutrient media modified as 
appropriate for activating promoters, selecting transformants, or amplifying the 
relevant gene. The culture conditions, such as temperature, pH and the like, are those 
previously used with the host cell selected for expression, and will be apparent to 
those skilled in the art and in the references cited herein, including Sambrook and
Ausubel. 

The host cell can be a eukaryotic cell, such as a yeast cell, or a plant cell, or 
the host cell can be a prokaryotic cell, such as a bacterial cell. Plant protoplasts are 
also suitable for some applications. For example, the DNA fragments are introduced 
into plant tissues, cultured plant cells or plant protoplasts by standard methods 
including electroporation (Fromm et al. (1985) Proc. Natl. Acad. Sci. USA 82, 5824),
infection by viral vectors such as cauliflower mosaic virus (CaMV) (Hohn et al.,
(1982) Molecular Biology of Plant Tumors. (Academic Press, New York) pp. 549-
560; US 4,407,956), high velocity ballistic penetration by small particles with the
nucleic acid either within the matrix of small beads or particles, or on the surface
(Klein et al., (1987) Nature 327, 70-73), use of pollen as vector (WO 85/01856), or
use of Agrobacterium tumefaciens or A. rhizogenes carrying a T-DNA plasmid in 
which DNA fragments are cloned. The T-DNA plasmid is transmitted to plant cells 
upon infection by Agrobacterium tumefaciens, and a portion is stably integrated into 
the plant genome (Horsch et al. (1984) Science 233:496-498; Fraley et al. (1983)
Proc. Natl. Acad. Sci USA 80, 4803). 

The cell can include a nucleic acid of the invention which encodes a 
polypeptide, wherein the cell expresses a polypeptide of the invention. The cell can
also include vector sequences, or the like. Furthermore, cells and transgenic plants 
that include any polypeptide or nucleic acid above or throughout this specification, 
e.g., produced by transduction of a vector of the invention, are an additional feature of 
the invention. 

For long-term, high-yield production of recombinant proteins, stable 
expression can be used. Host cells transformed with a nucleotide sequence encoding 
a polypeptide of the invention are optionally cultured under conditions suitable for the 
expression and recovery of the encoded protein from cell culture. The protein or
fragment thereof produced by a recombinant cell may be secreted, membrane-bound, 
or contained intracellularly, depending on the sequence and/or the vector used. As 
will be understood by those of skill in the art, expression vectors containing 
polynucleotides encoding mature proteins of the invention can be designed with 
signal sequences which direct secretion of the mature polypeptides through a 
prokaryotic or eukaryotic cell membrane. 

XI. Modified Amino Acid Residues 

Polypeptides of the invention may contain one or more modified amino acid 
residues. The presence of modified amino acids may be advantageous in, for 
example, increasing polypeptide half-life, reducing polypeptide antigenicity or 
toxicity, increasing polypeptide storage stability, or the like. Amino acid residue(s) 
are modified, for example, co-translationally or post-translationally during 
recombinant production or modified by synthetic or chemical means. 

Non-limiting examples of a modified amino acid residue include incorporation 
or other use of acetylated amino acids, glycosylated amino acids, sulfated amino 
acids, prenylated (e.g., farnesylated, geranylgeranylated) amino acids, PEG modified 
(e.g., "PEGylated") amino acids, biotinylated amino acids, carboxylated amino acids,
phosphorylated amino acids, etc. References adequate to guide one of skill in the 
modification of amino acid residues are replete throughout the literature. 

The modified amino acid residues may prevent or increase affinity of the 
polypeptide for another molecule, including, but not limited to, polynucleotides,
proteins, carbohydrates, lipids and lipid derivatives, and other organic or synthetic 
compounds. 

XII. Identification of Additional Factors 

A transcription factor provided by the present invention can also be used to 
identify additional endogenous or exogenous molecules that can affect a phenotype or
trait of interest. On the one hand, such molecules include organic (small or large 
molecules) and/or inorganic compounds that affect expression of (i.e., regulate) a 
particular transcription factor. Alternatively, such molecules include endogenous 
molecules that are acted upon at a transcriptional level by a transcription factor
of the invention to modify a phenotype as desired. For example, the transcription
factors can be employed to identify one or more downstream genes that are
subject to a regulatory effect of the transcription factor. In one approach, a
transcription factor or transcription factor homologue of the invention is expressed in 
a host cell, e.g., a transgenic plant cell, tissue or explant, and expression products, 
either RNA or protein, of likely or random targets are monitored, e.g., by 
hybridization to a microarray of nucleic acid probes corresponding to genes expressed 
in a tissue or cell type of interest, by two-dimensional gel electrophoresis of protein 
products, or by any other method known in the art for assessing expression of gene
products at the level of RNA or protein. Alternatively, a transcription factor of the 
invention can be used to identify promoter sequences (i.e., binding sites) involved in 
the regulation of a downstream target. After identifying a promoter sequence, 
interactions between the transcription factor and the promoter sequence can be 
modified by changing specific nucleotides in the promoter sequence or specific amino 
acids in the transcription factor that interact with the promoter sequence to alter a 
plant trait. Typically, transcription factor DNA-binding sites are identified by gel 
shift assays. After identifying the promoter regions, the promoter region sequences 
can be employed in double-stranded DNA arrays to identify molecules that affect the 
interactions of the transcription factors with their promoters (Bulyk et al. (1999) 
Nature Biotechnology 17:573-577).

The identified transcription factors are also useful to identify proteins that 
modify the activity of the transcription factor. Such modification can occur by 
covalent modification, such as by phosphorylation, or by protein-protein (homo- or
heteropolymer) interactions. Any method suitable for detecting protein-protein
interactions can be employed. Among the methods that can be employed are co- 
immunoprecipitation, cross-linking and co-purification through gradients or 
chromatographic columns, and the two-hybrid yeast system. 

The two-hybrid system detects protein interactions in vivo and is described in 
Chien et al. ((1991), Proc. Natl. Acad. Sci. USA 88:9578-9582) and is commercially 
available from Clontech (Palo Alto, Calif.). In such a system, plasmids are 
constructed that encode two hybrid proteins: one consists of the DNA-binding domain 
of a transcription activator protein fused to the TF polypeptide and the other consists
of the transcription activator protein's activation domain fused to an unknown protein
that is encoded by a cDNA that has been recombined into the plasmid as part of a 
cDNA library. The DNA-binding domain fusion plasmid and the cDNA library are 
transformed into a strain of the yeast Saccharomyces cerevisiae that contains a 
reporter gene (e.g., lacZ) whose regulatory region contains the transcription activator's 
binding site. Either hybrid protein alone cannot activate transcription of the reporter 
gene. Interaction of the two hybrid proteins reconstitutes the functional activator 
protein and results in expression of the reporter gene, which is detected by an assay 
for the reporter gene product. Then, the library plasmids responsible for reporter gene 
expression are isolated and sequenced to identify the proteins encoded by the library 
plasmids. After identifying proteins that interact with the transcription factors, assays 
for compounds that interfere with the TF protein-protein interactions can be 
performed.

XIII. Identification of Modulators

In addition to the intracellular molecules described above, extracellular 
molecules that alter activity or expression of a transcription factor, either directly or 
indirectly, can be identified. For example, the methods can entail first placing a 
candidate molecule in contact with a plant or plant cell. The molecule can be 
introduced by topical administration, such as spraying or soaking of a plant, and then 
the molecule's effect on the expression or activity of the TF polypeptide or the 
expression of the polynucleotide monitored. Changes in the expression of the TF 
polypeptide can be monitored by use of polyclonal or monoclonal antibodies, gel 
electrophoresis or the like. Changes in the expression of the corresponding 
polynucleotide sequence can be detected by use of microarrays, Northerns, 
quantitative PCR, or any other technique for monitoring changes in mRNA 
expression. These techniques are exemplified in Ausubel et al. (eds) Current 
Protocols in Molecular Biology, John Wiley & Sons (1998, and supplements through 
2001). Such changes in the expression levels can be correlated with modified plant 
traits and thus identified molecules can be useful for soaking or spraying on fruit, 
vegetable and grain crops to modify traits in plants. 

Essentially any available composition can be tested for modulatory activity of
expression or activity of any nucleic acid or polypeptide herein. Thus, available 
libraries of compounds such as chemicals, polypeptides, nucleic acids and the like can 
be tested for modulatory activity. Often, potential modulator compounds can be 
dissolved in aqueous or organic (e.g., DMSO-based) solutions for easy delivery to the 
cell or plant of interest in which the activity of the modulator is to be tested. 
Optionally, the assays are designed to screen large modulator composition libraries by 
automating the assay steps and providing compounds from any convenient source to 
assays, which are typically run in parallel (e.g., in microtiter formats on microtiter
plates in robotic assays). 

In one embodiment, high throughput screening methods involve providing a 
combinatorial library containing a large number of potential compounds (potential 
modulator compounds). Such "combinatorial chemical libraries" are then screened in 
one or more assays, as described herein, to identify those library members (particular 
chemical species or subclasses) that display a desired characteristic activity. The 
compounds thus identified can serve as target compounds. 

A combinatorial chemical library can be, e.g., a collection of diverse chemical 
compounds generated by chemical synthesis or biological synthesis. For example, a 
combinatorial chemical library such as a polypeptide library is formed by combining a 
set of chemical building blocks (e.g., in one example, amino acids) in every possible 
way for a given compound length (i.e., the number of amino acids in a polypeptide 
compound of a set length). Exemplary libraries include peptide libraries, nucleic acid 
libraries, antibody libraries (see, e.g., Vaughn et al. (1996) Nature Biotechnology. 
14(3):309-314 and PCT/US96/10287), carbohydrate libraries (see, e.g., Liang et al. 
Science (1996) 274:1520-1522 and U.S. Patent 5,593,853), peptide nucleic acid 
libraries (see, e.g., U.S. Patent 5,539,083), and small organic molecule libraries (see, 
e.g., benzodiazepines, Baum C&EN Jan 18, page 33 (1993); isoprenoids, U.S. Patent 
5,569,588; thiazolidinones and metathiazanones, U.S. Patent 5,549,974; pyrrolidines, 
U.S. Patents 5,525,735 and 5,519,134; morpholino compounds, U.S. Patent 
5,506,337) and the like.
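
The phrase "in every possible way for a given compound length" can be made concrete with a short sketch. Assuming, for illustration only, the 20 standard amino acids as building blocks, a library of peptides of length n contains 20^n members. The function names below are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one-letter codes


def library_size(n_blocks, length):
    """Number of distinct linear compounds of the given length."""
    return n_blocks ** length


def enumerate_library(blocks, length):
    """Lazily yield every compound formed by combining the blocks at each position."""
    for combo in product(blocks, repeat=length):
        yield "".join(combo)


print(library_size(len(AMINO_ACIDS), 3))           # size of the tripeptide library
print(sum(1 for _ in enumerate_library(AMINO_ACIDS, 2)))  # dipeptides actually enumerated
```

Because the library size grows exponentially with compound length, enumeration is written as a generator so that members can be consumed one at a time rather than held in memory.
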

Preparation and screening of combinatorial or other libraries is well known to
those of skill in the art. Such combinatorial chemical libraries include, but are not 
limited to, peptide libraries (see, e.g., U.S. Patent 5,010,175; Furka, (1991) Int. J. 
Pept. Prot. Res. 37:487-493; and Houghton et al. (1991) Nature 354:84-88). Other
chemistries for generating chemical diversity libraries can also be used. 

In addition, as noted, compound screening equipment for high-throughput 
screening is generally available, e.g., using any of a number of well known robotic 
systems that have also been developed for solution phase chemistries useful in assay 
systems. These systems include automated workstations including an automated 
synthesis apparatus and robotic systems utilizing robotic arms. Any of the above 
devices are suitable for use with the present invention, e.g., for high-throughput 
screening of potential modulators. The nature and implementation of modifications to 
these devices (if any) so that they can operate as discussed herein will be apparent to 
persons skilled in the relevant art. 

Indeed, entire high throughput screening systems are commercially available. 
These systems typically automate entire procedures including all sample and reagent 
pipetting, liquid dispensing, timed incubations, and final readings of the microplate in 
detector(s) appropriate for the assay. These configurable systems provide high 
throughput and rapid start up as well as a high degree of flexibility and customization. 
Similarly, microfluidic implementations of screening are also commercially available. 

The manufacturers of such systems provide detailed protocols for the various high
throughput systems. Thus, for example, Zymark Corp. provides technical bulletins describing
screening systems for detecting the modulation of gene transcription, ligand binding, 
and the like. The integrated systems herein, in addition to providing for sequence 
alignment and, optionally, synthesis of relevant nucleic acids, can include such 
screening apparatus to identify modulators that have an effect on one or more 
polynucleotides or polypeptides according to the present invention. 

In some assays it is desirable to have positive controls to ensure that the 
components of the assays are working properly. At least two types of positive 
controls are appropriate. That is, known transcriptional activators or inhibitors can be 
incubated with cells, plants, etc. in one sample of the assay, and the resulting
increase/decrease in transcription can be detected by measuring the resulting increase
in RNA/protein expression, etc., according to the methods herein. It will be
appreciated that modulators can also be combined with transcriptional activators or 
inhibitors to find modulators that inhibit transcriptional activation or transcriptional 
repression. Either expression of the nucleic acids and proteins herein or any 
additional nucleic acids or proteins activated by the nucleic acids or proteins herein, 
or both, can be monitored.

In an embodiment, the invention provides a method for identifying 
compositions that modulate the activity or expression of a polynucleotide or 
polypeptide of the invention. For example, a test compound, whether a small or large 
molecule, is placed in contact with a cell, plant (or plant tissue or explant), or 
composition comprising the polynucleotide or polypeptide of interest and a resulting 
effect on the cell, plant, (or tissue or explant) or composition is evaluated by 
monitoring, either directly or indirectly, one or more of: expression level of the 
polynucleotide or polypeptide, activity (or modulation of the activity) of the 
polynucleotide or polypeptide. In some cases, an alteration in a plant phenotype can 
be detected following contact of a plant (or plant cell, or tissue or explant) with the
putative modulator, e.g., by modulation of expression or activity of a polynucleotide 
or polypeptide of the invention. Modulation of expression or activity of a 
polynucleotide or polypeptide of the invention may also be caused by molecular 
elements in a signal transduction second messenger pathway and such modulation can 
affect similar elements in the same or another signal transduction second messenger 
pathway. 

XIV. Subsequences 

Also contemplated are uses of polynucleotides, also referred to herein as 
oligonucleotides, typically having at least 12 bases, preferably at least 15, more 
preferably at least 20, 30, or 50 bases, which hybridize under at least highly stringent
(or ultra-high stringent or ultra-ultra-high stringent) conditions to a
polynucleotide sequence described above. The polynucleotides may be used as 
probes, primers, sense and antisense agents, and the like, according to methods as 
noted supra. 

Subsequences of the polynucleotides of the invention, including
polynucleotide fragments and oligonucleotides are useful as nucleic acid probes and 
primers. An oligonucleotide suitable for use as a probe or primer is at least about 15 
nucleotides in length, more often at least about 18 nucleotides, often at least about 21 
nucleotides, frequently at least about 30 nucleotides, or about 40 nucleotides, or more 
in length. A nucleic acid probe is useful in hybridization protocols, e.g., to identify 
additional polypeptide homologues of the invention, including protocols for 
microarray experiments. Primers can be annealed to a complementary target DNA 
strand by nucleic acid hybridization to form a hybrid between the primer and the 
target DNA strand, and then extended along the target DNA strand by a DNA 
polymerase enzyme. Primer pairs can be used for amplification of a nucleic acid 
sequence, e.g., by the polymerase chain reaction (PCR) or other nucleic-acid
amplification methods. See Sambrook and Ausubel, supra. 
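
As a minimal sketch of primer annealing by complementarity, the following Python code locates the site on a target strand to which a primer can anneal. The target sequence and function names are hypothetical, the 15-nucleotide minimum follows the length stated above, and only perfect complementarity is considered; real hybridization also depends on stringency conditions that are not modeled here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def annealing_sites(target, primer, min_length=15):
    """Return 0-based start positions on `target` (5' to 3') where `primer` anneals perfectly."""
    if len(primer) < min_length:
        raise ValueError("primer is shorter than the minimum length")
    site = reverse_complement(primer)  # the stretch of target that pairs with the primer
    target = target.upper()
    return [i for i in range(len(target) - len(site) + 1) if target.startswith(site, i)]


# Hypothetical target strand and an 18-nucleotide primer designed against positions 10-27.
target = "ATGGCTCTGTCTGAAGACCTTCGTAACGGATCCAAGTGCTAA"
primer = reverse_complement(target[10:28])
print(primer, annealing_sites(target, primer))
```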

In addition, the invention includes an isolated or recombinant polypeptide 
including a subsequence of at least about 15 contiguous amino acids encoded by the 
recombinant or isolated polynucleotides of the invention. For example, such 
polypeptides, or domains or fragments thereof, can be used as immunogens, e.g., to 
produce antibodies specific for the polypeptide sequence, or as probes for detecting a 
sequence of interest. A subsequence can range in size from about 15 amino acids in
length up to and including the full length of the polypeptide. 
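
The size range just described can be illustrated by enumerating contiguous subsequences of a polypeptide. The sequence below is a hypothetical 17-residue example, not a polypeptide of the invention, and the function name is an assumption of this sketch.

```python
def contiguous_subsequences(protein, min_length=15):
    """Yield every contiguous subsequence from `min_length` residues up to full length."""
    n = len(protein)
    for length in range(min_length, n + 1):
        for start in range(n - length + 1):
            yield protein[start:start + length]


# Hypothetical 17-residue polypeptide in one-letter code.
example = "MALSEDLRNGSKCQWTV"
fragments = list(contiguous_subsequences(example))
print(len(fragments), fragments[-1] == example)
```

For a polypeptide of n residues there are n - k + 1 contiguous subsequences of each length k, so the count grows quickly with n; the last subsequence yielded is the full-length polypeptide itself.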

To be encompassed by the present invention, an expressed polypeptide which 
comprises such a polypeptide subsequence performs at least one biological function 
of the intact polypeptide in substantially the same manner, or to a similar extent, as 
does the intact polypeptide. For example, a polypeptide fragment can comprise a 
recognizable structural motif or functional domain such as a DNA binding domain 
that binds to a specific DNA promoter region, an activation domain or a domain for 
protein-protein interactions. 

XV. Production of Transgenic Plants

Modification of Traits 

The polynucleotides of the invention are favorably employed to produce 
transgenic plants with various traits, or characteristics, that have been modified in a 
desirable manner, e.g., to improve the seed characteristics of a plant. For example, 
alteration of expression levels or patterns (e.g., spatial or temporal expression 
patterns) of one or more of the transcription factors (or transcription factor 
homologues) of the invention, as compared with the levels of the same protein found 
in a wild type plant, can be used to modify a plant's traits. An illustrative example of 
trait modification (improved characteristics) by altering expression levels of a
particular transcription factor is described further in the Examples and the Sequence 
Listing. 

Arabidopsis as a model system 

Arabidopsis thaliana is the object of rapidly growing attention as a model for 
genetics and metabolism in plants. Arabidopsis has a small genome, and well 
documented studies are available. It is easy to grow in large numbers and mutants 
defining important genetically controlled mechanisms are either available, or can 
readily be obtained. Various methods to introduce and express isolated homologous 
genes are available (see Koncz et al., eds., Methods in Arabidopsis Research
(1992), World Scientific, New Jersey, New Jersey, in "Preface"). Because of its small
size, short life cycle, obligate autogamy and high fertility, Arabidopsis is also a 
choice organism for the isolation of mutants and studies in morphogenetic and 
development pathways, and control of these pathways by transcription factors (Koncz, 
supra, p. 72). A number of studies introducing transcription factors into A. thaliana 
have demonstrated the utility of this plant for understanding the mechanisms of gene 
regulation and trait alteration in plants. See, for example, Koncz, supra, and U.S. 
Patent Number 6,417,428.

Arabidopsis genes in transgenic plants. 

Expression of genes that encode transcription factors that modify expression of
endogenous genes, polynucleotides, and proteins is well known in the art. In
addition, transgenic plants comprising isolated polynucleotides encoding transcription 
factors may also modify expression of endogenous genes, polynucleotides, and 
proteins. Examples include Peng et al. (1997, Genes and Development 11:3194-
3205) and Peng et al. (1999, Nature, 400:256-261). In addition, many others have
demonstrated that an Arabidopsis transcription factor expressed in an exogenous plant 
species elicits the same or very similar phenotypic response. See, for example, Fu et 
al. (2001, Plant Cell 13:1791-1802); Nandi et al. (2000, Curr. Biol. 10:215-218); 
Coupland (1995, Nature 377:482-483); and Weigel and Nilsson (1995, Nature 
377:482-500). 

Homologous genes introduced into transgenic plants. 

Homologous genes that may be derived from any plant, or from any source
whether natural, synthetic, semi-synthetic or recombinant, and that share significant 
sequence identity or similarity to those provided by the present invention, may be 
introduced into plants, for example, crop plants, to confer desirable or improved traits. 
Consequently, transgenic plants may be produced that comprise a recombinant 
expression vector or cassette with a promoter operably linked to one or more 
sequences homologous to presently disclosed sequences. The promoter may be, for
example, a plant or viral promoter.

The invention thus provides for methods for preparing transgenic plants, and 
for modifying plant traits. These methods include introducing into a plant a 
recombinant expression vector or cassette comprising a functional promoter operably 
linked to one or more sequences homologous to presently disclosed sequences. Plants 
and kits for producing these plants that result from the application of these methods 
are also encompassed by the present invention. 

The complete descriptions of the traits associated with each polynucleotide of
the invention are fully disclosed in Table 4, Table 5, and Table 6.

Traits of interest

Examples of some of the traits that may be desirable in plants, and that may be 
provided by transforming the plants with the presently disclosed sequences, are listed 
in Table 6. 



Table 6. Genes, traits and utilities that affect plant characteristics 

Columns: Trait category; Traits; Transcription factor genes that impact traits; 
Utility (gene effect on) 




Trait category: Resistance and tolerance 

Trait: Salt stress resistance 
Genes: G22; G196; G226; G303; G312; G325; G353; G482; G545; G801; G867; G884; 
    G922; G926; G1452; G1794; G1820; G1836; G1843; G1863; G2053; G2110; G2140; 
    G2153; G2379; G2701; G2713; G2719; G2789 
Utility: Germination rate, survivability, yield; extended growth range 

Trait: Osmotic stress resistance 
Genes: G47; G175; G188; G303; G325; G353; G489; G502; G526; G921; G922; G926; 
    G1069; G1089; G1452; G1794; G1930; G2140; G2153; G2379; G2701; G2719; G2789 
Utility: Germination rate, survivability, yield 

Trait: Cold stress resistance; cold germination 
Genes: G256; G394; G664; G864; G1322; G2130 
Utility: Germination, growth, earlier planting 

Trait: Tolerance to freezing 
Genes: G303; G325; G353; G720; G912; G913; G1794; G2053; G2140; G2153; G2379; 
    G2701; G2719; G2789 
Utility: Survivability, yield, appearance, extended range 

Trait: Heat stress resistance 
Genes: G3; G464; G682; G864; G964; G1305; G1645; G2130; G2430 
Utility: Germination, growth, later planting 

Trait: Drought, low humidity resistance 
Genes: G303; G325; G353; G720; G912; G926; G1452; G1794; G1820; G1843; G2053; 
    G2140; G2153; G2379; G2583; G2701; G2719; G2789 
Utility: Survivability, yield, extended range 

Trait: Radiation resistance 
Genes: G1052 
Utility: Survivability, vigor, appearance 

Trait: Decreased herbicide sensitivity 
Genes: G343; G2133; G2517 
Utility: Resistant to increased herbicide use 

Trait: Increased herbicide sensitivity 
Genes: G374; G877; G1519 
Utility: Use as a herbicide target 

Trait: Oxidative stress 
Genes: G477; G789; G1807; G2133; G2517 
Utility: Improved yield, appearance, reduced senescence 

Trait: Light response 
Genes: G183; G354; G375; G1062; G1322; G1331; G1488; G1494; G1521; G1786; 
    G1794; G2144; G2555 
Utility: Germination, growth, development, flowering time 




Trait category: Development, morphology 

Trait: Overall plant architecture 
Genes: G24; G27; G31; G33; G47; G147; G156; G160; G182; G187; G195; G196; G211; 
    G221; G237; G280; G342; G352; G357; G358; G360; G362; G364; G365; G367; 
    G373; G377; G396; G431; G447; G479; G546; G546; G551; G578; G580; G596; 
    G615; G617; G620; G625; G638; G658; G716; G725; G727; G730; G740; G770; 
    G858; G865; G869; G872; G904; G910; G912; G920; G939; G963; G977; G979; 
    G987; G988; G993; G1007; G1010; G1014; G1035; G1046; G1049; G1062; G1069; 
    G1070; G1076; G1089; G1093; G1127; G1131; G1145; G1229; G1246; G1304; 
    G1318; G1320; G1330; G1331; G1352; G1354; G1360; G1364; G1379; G1384; 
    G1399; G1415; G1417; G1442; G1453; G1454; G1459; G1460; G1471; G1475; 
    G1477; G1487; G1487; G1492; G1499; G1499; G1531; G1540; G1543; G1543; 
    G1544; G1548; G1584; G1587; G1588; G1589; G1636; G1642; G1747; G1749; 
    G1749; G1751; G1752; G1763; G1766; G1767; G1778; G1789; G1790; G1791; 
    G1793; G1794; G1795; G1800; G1806; G1811; G1835; G1836; G1838; G1839; 
    G1843; G1853; G1855; G1865; G1881; G1882; G1883; G1884; G1891; G1896; 
    G1898; G1902; G1904; G1906; G1913; G1914; G1925; G1929; G1930; G1954; 
    G1958; G1965; G1976; G2057; G2107; G2133; G2134; G2151; G2154; G2157; 
    G2181; G2290; G2299; G2340; G2340; G2346; G2373; G2376; G2424; G2465; 
    G2505; G2509; G2512; G2513; G2519; G2520; G2533; G2534; G2573; G2589; 
    G2687; G2720; G2787; G2789; G2893 
Utility: Vascular tissues, lignin content; cell wall content; appearance 

Trait: Size: increased stature 
Genes: G189; G1073; G1435; G2430 
Utility: (not stated) 

Trait: Size: reduced stature or dwarfism 
Genes: G3; G5; G21; G23; G39; G165; G184; G194; G258; G280; G340; G343; G353; 
    G354; G362; G363; G370; G385; G396; G439; G440; G447; G450; G550; G557; 
    G599; G636; G652; G670; G671; G674; G729; G760; G804; G831; G864; G884; 
    G898; G900; G912; G913; G922; G932; G937; G939; G960; G962; G977; G991; 
    G1000; G1008; G1020; G1023; G1053; G1067; G1075; G1137; G1181; G1198; 
    G1228; G1266; G1267; G1275; G1277; G1309; G1311; G1314; G1317; G1322; 
    G1323; G1326; G1332; G1334; G1367; G1381; G1382; G1386; G1421; G1488; 
    G1494; G1537; G1545; G1560; G1586; G1641; G1652; G1655; G1671; G1750; 
    G1756; G1757; G1782; G1786; G1794; G1839; G1845; G1879; G1886; G1888; 
    G1933; G1939; G1943; G1944; G2011; G2094; G2115; G2130; G2132; G2144; 
    G2145; G2147; G2156; G2294; G2313; G2344; G2431; G2510; G2517; G2521; 
    G2893; G2893 
Utility: Ornamental; small stature provides wind resistance; creation of dwarf 
    varieties 

Trait: Fruit size and number 
Genes: G362 
Utility: Biomass, yield, cotton boll fiber density 

Trait: Flower structure, inflorescence 
Genes: G47; G259; G353; G354; G671; G732; G988; G1000; G1063; G1140; G1326; 
    G1449; G1543; G1560; G1587; G1645; G1947; G2108; G2143; G2893 
Utility: Ornamental horticulture; production of saffron or other edible flowers 

Trait: Number and development of trichomes 
Genes: G225; G226; G247; G362; G585; G634; G676; G682; G1014; G1332; G1452; 
    G1795; G2105 
Utility: Resistance to pests and desiccation; essential oil production 

Trait: Seed size, color, and number 
Genes: G156; G450; G584; G652; G668; G858; G979; G1040; G1062; G1145; G1255; 
    G1494; G1531; G1534; G1594; G2105; G2114 
Utility: Yield 

Trait: Root development, modifications 
Genes: G9; G1482; G1534; G1794; G1852; G2053; G2136; G2140 
Utility: (not stated) 

Trait: Modifications to root hairs 
Genes: G225; G226 
Utility: Nutrient, water uptake, pathogen resistance 

Trait: Apical dominance 
Genes: G559; G732; G1255; G1275; G1411; G1488; G1635; G2452; G2509 
Utility: Ornamental horticulture 

Trait: Branching patterns 
Genes: G568; G988; G1548 
Utility: Ornamental horticulture, knot reduction, improved windscreen 

Trait: Leaf shape, color, modifications 
Genes: G375; G377; G428; G438; G447; G464; G557; G577; G599; G635; G671; G674; 
    G736; G804; G903; G977; G921; G922; G1038; G1053; G1067; G1073; G1075; 
    G1146; G1152; G1198; G1267; G1269; G1452; G1484; G1586; G1594; G1767; 
    G1786; G1792; G1886; G2059; G2094; G2105; G2113; G2117; G2143; G2144; 
    G2431; G2452; G2465; G2587; G2583; G2724 
Utility: Appealing shape or shiny leaves for ornamental agriculture, increased 
    biomass or photosynthesis 

Trait: Silique 
Genes: G1134 
Utility: Ornamental 

Trait: Stem morphology 
Genes: G47; G438; G671; G748; G988; G1000 
Utility: Ornamental; digestibility 

Trait: Shoot modifications 
Genes: G390; G391 
Utility: Ornamental stem bifurcations 




Trait category: Disease, Pathogen Resistance 

Trait: Bacterial 
Genes: G211; G347; G367; G418; G525; G545; G578; G1049 
Utility: Yield, appearance, survivability, extended range 

Trait: Fungal 
Genes: G19; G28; G28; G28; [illegible]; G188; G207; G211; G237; G248; G278; 
    G347; G367; G371; G378; G409; G477; G545; G545; G558; G569; G578; G591; 
    G594; G616; G789; G805; G812; G865; G869; G872; G881; G896; G940; G1047; 
    G1049; G1064; G1084; G1196; G1255; G1266; G1363; G1514; G1756; G1792; 
    G1792; G1792; G1792; G1880; G1919; G1919; G1927; G1927; G1936; G1936; 
    G1950; G2069; G2130; G2380; G2380; G2555 
Utility: Yield, appearance, survivability, extended range 






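Trait category: Nutrients 

Trait: Increased tolerance to nitrogen-limited soils 
Genes: G225; G226; G1792 
Utility: (not stated) 

Trait: Increased tolerance to phosphate-limited soils 
Genes: G419; G545; G561; G1946 
Utility: (not stated) 

Trait: Increased tolerance to potassium-limited soils 
Genes: G561; G911 
Utility: (not stated) 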






Trait category: Hormonal 

Trait: Hormone sensitivity 
Genes: G12; G546; G926; G760; G913; G926; G1062; G1069; G1095; G1134; G1330; 
    G1452; G1666; G1820; G2140; G2789 
Utility: Seed dormancy, drought tolerance; plant form, fruit ripening 




Trait category: Seed biochemistry 

Trait: Production of seed prenyl lipids, including tocopherol 
Genes: G214; G259; G490; G652; G748; G883; G1052; G1328; G1930; G2509; G2520 
Utility: Antioxidant activity, vitamin E 

Trait: Production of seed sterols 
Genes: G20 
Utility: Precursors for human steroid hormones; cholesterol modulators 

Trait: Production of seed glucosinolates 
Genes: G353; G484; G674; G1272; G1506; G1897; G1946; G2113; G2117; G2155; 
    G2290; G2340 
Utility: Defense against insects; putative anticancer activity; undesirable in 
    animal feeds 

Trait: Modified seed oil content 
Genes: G162; G162; G180; G192; G241; G265; G286; G291; G427; G509; G519; G561; 
    G567; G590; G818; G849; G892; G961; G974; G1063; G1143; G1190; G1198; 
    G1226; G1229; G1323; G1451; G1471; G1478; G1496; G1526; G1543; G1640; 
    G1644; G1646; G1672; G1677; G1750; G1765; G1777; G1793; G1838; G1902; 
    G1946; G1948; G2059; G2123; G2138; G2139; G2343; G2792; G2830 
Utility: Vegetable oil production; increased caloric value for animal feeds; 
    lutein content 

Trait: Modified seed oil composition 
Genes: G217; G504; G622; G778; G791; G861; G869; G938; G965; G1417; G2192 
Utility: Heat stability, digestibility of seed oils 

Trait: Modified seed protein content 
Genes: G162; G226; G241; G371; G427; G509; G567; G597; G732; G849; G865; G892; 
    G963; G988; G1323; G1323; G1419; G1478; G1488; G1634; G1637; G1641; G1644; 
    G1652; G1677; G1777; G1777; G1818; G1820; G1903; G1909; G1946; G1946; 
    G1958; G2059; G2117; G2417; G2509 
Utility: Reduced caloric value for humans 










Trait category: Leaf biochemistry 

Trait: Production of flavonoids 
Genes: G1666* 
Utility: Ornamental pigment production; pathogen resistance; health benefits 

Trait: Production of leaf glucosinolates 
Genes: G264; G353; G484; G652; G674; G681; G1069; G1198; G1322; G1421; G1657; 
    G1794; G1897; G1946; G2115; G2117; G2144; G2155; G2155; G2340; G2512; 
    G2520; G2552 
Utility: Defense against insects; putative anticancer activity; undesirable in 
    animal feeds 

Trait: Production of diterpenes 
Genes: G229 
Utility: Induction of enzymes involved in alkaloid biosynthesis 

Trait: Production of anthocyanin 
Genes: G546 
Utility: Ornamental pigment 

Trait: Production of leaf phytosterols, inc. stigmastanol, campesterol 
Genes: G561; G2131; G2424 
Utility: Precursors for human steroid hormones; cholesterol modulators 

Trait: Leaf fatty acid composition 
Genes: G214; G377; G861; G962; G975; G987; G1266; G1337; G1399; G1465; G1512; 
    G2136; G2147; G2192 
Utility: Nutritional value; increase in waxes for disease resistance 

Trait: Production of leaf prenyl lipids, including tocopherol 
Genes: G214; G259; G280; G652; G987; G1543; G2509; G2520 
Utility: Antioxidant activity, vitamin E 




Trait category: Biochemistry, general 

Trait: Production of miscellaneous secondary metabolites 
Genes: G229; G663 
Utility: (not stated) 

Trait: Sugar, starch, hemicellulose composition 
Genes: G158; G211; G211; G237; G242; G274; G598; G1012; G1266; G1309; G1309; 
    G1641; G1765; G1865; G2094; G2094; G2589; G2589 
Utility: Food digestibility, hemicellulose & pectin content; fiber content; 
    plant tensile strength, wood quality, pathogen resistance, pulp production; 
    tuber starch content 




Trait category: Sugar sensing 

Trait: Plant response to sugars 
Genes: G26; G38; G43; G207; G218; G241; G254; G263; G308; G536; G567; G567; 
    G680; G867; G912; G956; G996; G1068; G1225; G1314; G1314; G1337; G1759; 
    G1804; G2153; G2379 
Utility: Photosynthetic rate, carbohydrate accumulation, biomass production, 
    source-sink relationships, senescence 




Trait category: Growth, Reproduction 

Trait: Plant growth rate and development 
Genes: G447; G617; G674; G730; G917; G937; G1035; G1046; G1131; G1425; G1452; 
    G1459; G1492; G1589; G1652; G1879; G1943; G2430; G2431; G2465; G2521 
Utility: Faster growth, increased biomass or yield, improved appearance; delay 
    in bolting 

Trait: Embryo development 
Genes: G167 
Utility: (not stated) 

Trait: Seed germination rate 
Genes: G979; G1792; G2130 
Utility: Yield 

Trait: Plant, seedling vigor 
Genes: G561; G2346 
Utility: Survivability, yield 

Trait: Senescence; cell death 
Genes: G571; G636; G878; G1050; G1463; G1749; G1944; G2130; G2155; G2340; 
    G2383 
Utility: Yield, appearance; response to pathogens 

Trait: Modified fertility 
Genes: G39; G340; G439; G470; G559; G615; G652; G671; G779; G962; G977; G988; 
    G1000; G1063; G1067; G1075; G1266; G1311; G1321; G1326; G1367; G1386; 
    G1421; G1453; G1471; G1453; G1560; G1594; G1635; G1750; G1947; G2011; 
    G2094; G2113; G2115; G2130; G2143; G2147; G2294; G2510; G2893 
Utility: Prevents or minimizes escape of the pollen of GMOs 

Trait: Early flowering 
Genes: G147; G157; G180; G183; G183; G184; G185; G208; G227; G294; G390; G390; 
    G390; G391; G391; G427; G427; G490; G565; G590; G592; G720; G789; G865; 
    G898; G898; G989; G989; G1037; G1037; G1142; G1225; G1225; G1226; G1242; 
    G1305; G1305; G1380; G1380; G1480; G1480; G1488; G1494; G1545; G1545; 
    G1649; G1706; G1760; G1767; G1767; G1820; G1841; G1841; G1842; G1843; 
    G1843; G1946; G1946; G2010; G2030; G2030; G2144; G2144; G2295; G2295; 
    G2347; G2348; G2348; G2373; G2373; G2509; G2509; G2555; G2555 
Utility: Faster generation time; synchrony of flowering; potential for 
    introducing new traits to single variety 

Trait: Delayed flowering 
Genes: G8; G47; G192; G214; G234; G361; G362; G562; G568; G571; G591; G680; 
    G736; G748; G859; G878; G910; G912; G913; G971; G994; G1051; G1052; G1073; 
    G1079; G1335; G1435; G1452; G1478; G1789; G1804; G1865; G1865; G1895; 
    G1900; G2007; G2133; G2155; G2291; G2465 
Utility: Delayed time to pollen production of GMO plants; synchrony of 
    flowering; increased yield 

Trait: Extended flowering phase 
Genes: G1947 
Utility: (not stated) 

Trait: Flower and leaf development 
Genes: G259; G353; G377; G580; G638; G652; G858; G869; G917; G922; G932; G1063; 
    G1075; G1140; G1425; G1452; G1499; G1548; G1645; G1865; G1897; G1933; 
    G2094; G2124; G2140; G2143; G2535; G2557 
Utility: Ornamental applications; decreased fertility 

Trait: Flower abscission 
Genes: G1897 
Utility: Ornamental: longer retention of flowers 



* When co-expressed with G669 and G663 
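
Because many genes in Table 6 appear under more than one trait, it can be helpful to cross-reference the table by gene rather than by trait. The following is a minimal illustrative sketch, not part of the disclosed methods. It encodes only four short rows of Table 6, and the names TRAIT_TO_GENES, invert and genes_with_multiple_traits are hypothetical helpers introduced here for illustration.

```python
from collections import defaultdict

# Illustrative subset of Table 6 (trait -> gene identifiers). Only four short
# rows are copied here; the full table contains many more traits and genes.
TRAIT_TO_GENES = {
    "Radiation resistance": ["G1052"],
    "Decreased herbicide sensitivity": ["G343", "G2133", "G2517"],
    "Increased herbicide sensitivity": ["G374", "G877", "G1519"],
    "Oxidative stress": ["G477", "G789", "G1807", "G2133", "G2517"],
}


def invert(trait_to_genes):
    """Build a gene -> sorted list of traits mapping (hypothetical helper)."""
    gene_to_traits = defaultdict(set)
    for trait, genes in trait_to_genes.items():
        for gene in genes:
            # A set absorbs genes that are listed twice under the same trait.
            gene_to_traits[gene].add(trait)
    return {gene: sorted(traits) for gene, traits in gene_to_traits.items()}


def genes_with_multiple_traits(gene_to_traits, min_traits=2):
    """Return the genes associated with at least min_traits distinct traits."""
    return sorted(
        gene for gene, traits in gene_to_traits.items() if len(traits) >= min_traits
    )


GENE_TO_TRAITS = invert(TRAIT_TO_GENES)
print(GENE_TO_TRAITS["G2133"])
print(genes_with_multiple_traits(GENE_TO_TRAITS))
```

In this subset, G2133 and G2517 are each listed under both decreased herbicide sensitivity and oxidative stress. Extending the dictionary with further rows of Table 6 would surface other genes that recur across traits.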



Significance of modified plant traits 

Currently, the existence of a series of maturity groups for different latitudes 
represents a major barrier to the introduction of new valuable traits. Any trait (e.g., 
disease resistance) has to be bred into each of the different maturity groups separately, 
a laborious and costly exercise. The availability of a single strain that could be 
grown at any latitude would therefore greatly increase the potential for introducing 
new traits to crop species such as soybean and cotton. 

For many of the traits, listed in Table 6 and below, that may be conferred to 
plants, a single transcription factor gene may be used to increase or decrease, advance 
or delay, or improve or prove deleterious to a given trait. For example, 
overexpression of a transcription factor gene that naturally occurs in a plant may 
cause early flowering relative to non-transformed or wild-type plants. By knocking 
out the gene, or suppressing the gene (with, for example, antisense suppression) the 
plant may experience delayed flowering. Similarly, overexpressing or suppressing 
one or more genes can impart significant differences in production of plant products, 
such as different fatty acid ratios. Likewise, suppressing a gene that causes a plant to be 
more sensitive to cold may improve the plant's tolerance of cold. 

Salt stress resistance. Soil salinity is one of the more important variables that 
determine where a plant may thrive. Salinity is especially important for the 
successful cultivation of crop plants, particularly in the many parts of the world that have 
naturally high soil salt concentrations, or where the soil has been over-utilized. Thus, 
presently disclosed transcription factor genes that provide increased salt tolerance 
during germination, the seedling stage, and throughout a plant's life cycle would find 
particular value for imparting survivability and yield in areas where a particular crop 
would not normally prosper. 

Osmotic stress resistance. Presently disclosed transcription factor genes that 
confer resistance to osmotic stress may increase germination rate under adverse 
conditions, which could impact survivability and yield of seeds and plants. 

Cold stress resistance. The potential utility of presently disclosed transcription 
factor genes that increase tolerance to cold is to confer better germination and growth 
in cold conditions. The germination of many crops is very sensitive to cold 
temperatures. Genes that would allow germination and seedling vigor in the cold 
would have highly significant utility in allowing seeds to be planted earlier in the 
season with a high rate of survivability. Transcription factor genes that confer better 
survivability in cooler climates allow a grower to move up planting time in the spring 
and extend the growing season further into autumn for higher crop yields. 

Tolerance to freezing. The presently disclosed transcription factor genes that 
impart tolerance to freezing conditions are useful for enhancing the survivability and 
appearance of plants under freezing conditions or conditions that would otherwise cause 
extensive cellular damage. Thus, germination of seeds and survival may take place at 
temperatures significantly below the mean temperature required for 
germination of seeds and survival of non-transformed plants. As with salt tolerance, 
this has the added benefit of increasing the potential range of a crop plant into regions 
in which it would otherwise succumb. Cold tolerant transformed plants may also be 
planted earlier in the spring or later in autumn, with greater success than with 
non-transformed plants. 

Heat stress tolerance. The germination of many crops is also sensitive to high 
temperatures. Presently disclosed transcription factor genes that provide increased 
heat tolerance are generally useful in producing plants that germinate and grow in hot 
conditions. They may find particular use for crops that are planted late in the season, or 
may extend the range of a plant by allowing growth in relatively hot climates. 

Drought, low humidity tolerance. Strategies that allow plants to survive in 
low water conditions may include, for example, reduced surface area or surface oil or 
wax production. A number of presently disclosed transcription factor genes increase 
a plant's tolerance to low water conditions and provide the benefits of improved 
survivability, increased yield and an extended geographic and temporal planting 
range. 

Radiation resistance. Presently disclosed transcription factor genes have been 
shown to increase lutein production. Lutein, like other xanthophylls such as 
zeaxanthin and violaxanthin, is important in the protection of plants against the 
damaging effects of excessive light. Lutein contributes, directly or indirectly, to the 
rapid rise of non-photochemical quenching in plants exposed to high light. Increased 
tolerance of field plants to visible and ultraviolet light impacts survivability and vigor, 
particularly for recent transplants. Also affected are the yield and appearance of 
harvested plants or plant parts. Crop plants engineered with presently disclosed 
transcription factor genes that cause the plant to produce higher levels of lutein 
therefore would have improved photoprotection, leading to less oxidative damage and 
increased vigor, survivability and yield under high light and ultraviolet light 
conditions. 

Decreased herbicide sensitivity. Presently disclosed transcription factor genes 
that confer resistance or tolerance to herbicides (e.g., glyphosate) may find use in 
providing means to increase herbicide applications without detriment to desirable 
plants. This would allow for the increased use of a particular herbicide in a local 
environment, with the effect of increased detriment to undesirable species and less 
harm to transgenic, desirable cultivars. 



Increased herbicide sensitivity. Knockouts of a number of the presently 
disclosed transcription factor genes have been shown to be lethal to developing 
embryos. Thus, these genes are potentially useful as herbicide targets. 

Oxidative stress. In plants, as in all living things, abiotic and biotic stresses 
induce the formation of oxygen radicals, including superoxide and peroxide radicals. 
This has the effect of accelerating senescence, particularly in leaves, with the resulting 
loss of yield and adverse effect on appearance. Generally, plants that have the highest 
level of defense mechanisms, such as, for example, polyunsaturated moieties of 
membrane lipids, are most likely to thrive under conditions that introduce oxidative 
stress (e.g., high light, ozone, water deficit, particularly in combination). Introduction 
of the presently disclosed transcription factor genes that increase the level of oxidative 
stress defense mechanisms would provide beneficial effects on the yield and 
appearance of plants. One specific oxidizing agent, ozone, has been shown to cause 
significant foliar injury, which impacts yield and appearance of crop and ornamental 
plants. In addition to the reduced foliar injury that would be found in ozone-resistant 
plants created by transforming plants with some of the presently disclosed transcription 
factor genes, such plants have also been shown to have increased chlorophyll 
fluorescence (Yu-Sen Chang et al. Bot. Bull. Acad. Sin. (2001) 42: 265-272). 

Heavy metal tolerance. Heavy metals such as lead, mercury, arsenic, 
chromium and others may have a significant adverse impact on plant respiration. 
Plants that have been transformed with presently disclosed transcription factor genes 
that confer improved resistance to heavy metals through, for example, sequestering or 
reduced uptake of the metals, will show improved vigor and yield in soils with 
relatively high concentrations of these elements. Conversely, transgenic transcription 
factors may also be introduced into plants to confer an increase in heavy metal uptake, 
which may benefit efforts to clean up contaminated soils. 

Light response. Presently disclosed transcription factor genes that modify a 
plant's response to light may be useful for modifying a plant's growth or 
development, for example, photomorphogenesis in poor light, or accelerating 
flowering time in response to various light intensities, quality or duration to which a 
non-transformed plant would not similarly respond. Examples of such responses that 
have been demonstrated include leaf number and arrangement, and early flower bud 
appearances. 

Overall plant architecture. Several presently disclosed transcription factor 
genes have been introduced into plants to alter numerous aspects of the plant's 
morphology. For example, it has been demonstrated that a number of transcription 
factors may be used to manipulate branching, such as the means to modify lateral 
branching, a possible application in the forestry industry. Transgenic plants have also 
been produced that have altered cell wall content, lignin production, flower organ 
number, or overall shape of the plants. Presently disclosed transcription factor genes 
transformed into plants may be used to affect plant morphology by increasing or 
decreasing internode distance, both of which may be advantageous under different 
circumstances. For example, for fast growth of woody plants to provide more 
biomass, or fewer knots, increased internode distances are generally desirable. For 
improved wind screening of shrubs or trees, or harvesting characteristics of, for 
example, members of the Gramineae family, decreased internode distance may be 
advantageous. These modifications would also prove useful in the ornamental 
horticulture industry for the creation of unique phenotypic characteristics of 
ornamental plants. 

Increased stature. For some ornamental plants, the ability to provide larger 
varieties may be highly desirable. For many plants, including fruit-bearing trees or 
trees and shrubs that serve as view or wind screens, increased stature provides 
obvious benefits. Crop species may also produce higher yields on larger cultivars. 

Reduced stature or dwarfism. Presently disclosed transcription factor genes 
that decrease plant stature can be used to produce plants that are more resistant to 
damage by wind and rain, or more resistant to heat or low humidity or water deficit. 
Dwarf plants are also of significant interest to the ornamental horticulture industry, 
and particularly for home garden applications for which space availability may be 
limited. 

Fruit size and number. Introduction of presently disclosed transcription factor 
genes that affect fruit size will have desirable impacts on fruit size and number, which 
may comprise increases in yield for fruit crops, or reduced fruit yield, such as when 
vegetative growth is preferred (e.g., with bushy ornamentals, or where fruit is 
undesirable, as with ornamental olive trees). 

Flower structure, inflorescence, and development. Presently disclosed 
transgenic transcription factors have been used to create plants with larger flowers or 
arrangements of flowers that are distinct from wild-type or non-transformed cultivars. 
This would likely have the most value for the ornamental horticulture industry, where 
larger flowers or interesting presentations generally are preferred and command the 
highest prices. Flower structure may have advantageous effects on fertility, and could 
be used, for example, to decrease fertility by the absence, reduction or screening of 
reproductive components. One interesting application for manipulation of flower 
structure, for example, by introduced transcription factors could be in the increased 
production of edible flowers or flower parts, including saffron, which is derived from 
the stigmas of Crocus sativus. 

Number and development of trichomes. Several presently disclosed 
transcription factor genes have been used to modify trichome number and amount of 
trichome products in plants. Trichome glands on the surface of many higher plants 
produce and secrete exudates that give protection from the elements and pests such as 
insects, microbes and herbivores. These exudates may physically immobilize insects 
and spores, may be insecticidal or antimicrobial, or they may act as allergens or 
irritants to protect against herbivores. Trichomes have also been suggested to decrease 
transpiration by decreasing leaf surface air flow, and by exuding chemicals that 
protect the leaf from the sun. 

Seed size, color and number. The introduction of presently disclosed 
transcription factor genes into plants that alter the size or number of seeds may have a 
significant impact on yield, both when the product is the seed itself, or when biomass 
of the vegetative portion of the plant is increased by reducing seed production. In the 
case of fruit products, it is often advantageous to modify a plant to have reduced size 
or number of seeds relative to non-transformed plants to provide seedless varieties or 
varieties with fewer or smaller seeds. Presently disclosed transcription factor genes 
have also been shown to affect seed size, including the development of larger seeds. 
Seed size, in addition to seed coat integrity, thickness and permeability, seed water 
content and a number of other components including antioxidants and 
oligosaccharides, may affect seed longevity in storage. This would be an important 
utility when the seed of a plant is the harvested crop, as with, for example, peas, 
beans, nuts, etc. Presently disclosed transcription factor genes have also been used to 
modify seed color, which could provide added appeal to a seed product. 

Root development modifications. By modifying the structure or development 
of roots by transforming into a plant one or more of the presently disclosed 
transcription factor genes, plants may be produced that have the capacity to thrive in 
otherwise unproductive soils. For example, grape roots that extend further into rocky 
soils, or that remain viable in waterlogged soils, would increase the effective planting 
range of the crop. It may be advantageous to manipulate a plant to produce short 
roots, as when a soil in which the plant will be growing is occasionally flooded, or 
when pathogenic fungi or disease-causing nematodes are prevalent. 

Modifications to root hairs. Presently disclosed transcription factor genes that 
increase root hair length or number potentially could be used to increase root growth 
or vigor, which might in turn allow better plant growth under adverse conditions such 
as limited nutrient or water availability. 

Apical dominance. The modified expression of presently disclosed 
transcription factors that control apical dominance could be used in ornamental 
horticulture, for example, to modify plant architecture. 

Branching patterns. Several presently disclosed transcription factor genes have 
been used to manipulate branching, which could provide benefits in the forestry 
industry. For example, reduction in the formation of lateral branches could reduce 
knot formation. Conversely, increasing the number of lateral branches could provide 
utility when a plant is used as a windscreen, or may also provide ornamental 
advantages. 

Leaf shape, color and modifications. It has been demonstrated in laboratory 
experiments that overexpression of some of the presently disclosed transcription 
factors produced marked effects on leaf development. At early stages of growth, these 
transgenic seedlings developed narrow, upward pointing leaves with long petioles, 
possibly indicating a disruption in circadian-clock controlled processes or nyctinastic 
movements. Other transcription factor genes can be used to increase plant biomass; 
large size would be useful in crops where the vegetative portion of the plant is the 
marketable portion. 

Siliques. Genes that alter silique conformation in brassicates may be used to 
modify fruit ripening processes in brassicates and other plants, which may positively 
affect seed or fruit quality. 

Stem morphology and shoot modifications. Laboratory studies have 
demonstrated that introducing several of the presently disclosed transcription factor 
genes into plants can cause stem bifurcations in shoots, in which the shoot meristems 
split to form two or three separate shoots. This unique appearance would be desirable 
in ornamental applications. 

Diseases, pathogens and pests. A number of the presently disclosed 
transcription factor genes have been shown to or are likely to confer resistance to 
various plant diseases, pathogens and pests. The offending organisms include fungal 
pathogens Fusarium oxysporum, Botrytis cinerea, Sclerotinia sclerotiorum, and 
Erysiphe orontii. Bacterial pathogens to which resistance may be conferred include 
Pseudomonas syringae. Other problem organisms may potentially include 
nematodes, mollicutes, parasites, or herbivorous arthropods. In each case, one or 
more transformed transcription factor genes may provide some benefit to the plant to 
help prevent or overcome infestation. The mechanisms by which the transcription 
factors work could include increasing surface waxes or oils, surface thickness, local 
senescence, or the activation of signal transduction pathways that regulate plant 
defense in response to attacks by herbivorous pests (including, for example, protease 
inhibitors). 

Increased tolerance of plants to nutrient-limited soils. Presently disclosed 
transcription factor genes introduced into plants may provide the means to improve 
uptake of essential nutrients, including nitrogenous compounds, phosphates, 
potassium, and trace minerals. The effect of these modifications is to increase the 
seedling germination and range of ornamental and crop plants. The utilities of 
presently disclosed transcription factor genes conferring tolerance to conditions of 
low nutrients also include cost savings to the grower by reducing the amounts of 
fertilizer needed, environmental benefits of reduced fertilizer runoff, and improved 
yield and stress tolerance. In addition, these genes could be used to alter seed protein 
amounts and/or composition that could impact yield as well as the nutritional value 
and production of various food products. 

Hormone sensitivity . One or more of the presently disclosed transcription 
factor genes have been shown to affect plant abscisic acid (ABA) sensitivity. This 
plant hormone is likely the most important hormone in mediating the adaptation of a 
plant to stress. For example, ABA mediates conversion of apical meristems into 
dormant buds. In response to increasingly cold conditions, the newly developing 
leaves growing above the meristem become converted into stiff bud scales that closely 
wrap the meristem and protect it from mechanical damage during winter. ABA in the 
bud also enforces dormancy; during premature warm spells, the buds are inhibited 
from sprouting. Bud dormancy is eliminated after either a prolonged period of 
cold or a significant number of lengthening days. Thus, by affecting ABA sensitivity, 
introduced transcription factor genes may affect cold sensitivity and survivability. 
ABA is also important in protecting plants from drought. 

Several other of the present transcription factor genes have been used to 
manipulate ethylene signal transduction and response pathways. These genes can thus 
be used to manipulate the processes influenced by ethylene, such as seed germination 
or fruit ripening, and to improve seed or fruit quality. 

Production of seed and leaf prenyl lipids, including tocopherol . Prenyl lipids 
play a role in anchoring proteins in membranes or membranous organelles. Thus 
modifying the prenyl lipid content of seeds and leaves could affect membrane 
integrity and function. A number of presently disclosed transcription factor genes 
have been shown to modify the tocopherol composition of plants. Tocopherols have 
both anti-oxidant and vitamin E activity. 

Production of seed and leaf phytosterols . Presently disclosed transcription 
factor genes that modify levels of phytosterols in plants may have at least two 
utilities. First, phytosterols are an important source of precursors for the manufacture 
of human steroid hormones. Thus, regulation of transcription factor expression or 
activity could lead to elevated levels of important human steroid precursors for steroid 
semi-synthesis. For example, transcription factors that cause elevated levels of 
campesterol in leaves, or sitosterols and stigmasterols in seed crops, would be useful 
for this purpose. Second, phytosterols and their hydrogenated derivatives, phytostanols, 
have proven cholesterol-lowering properties, and transcription factor genes that 
modify the expression of these compounds in plants would thus provide health 
benefits. 

Production of seed and leaf glucosinolates . Some glucosinolates have anti- 
cancer activity; thus, increasing the levels or composition of these compounds by 
introducing several of the presently disclosed transcription factors might be of interest 
from a nutraceutical standpoint. Glucosinolates also form part of a plant's natural 
defense against insects. Modification of glucosinolate composition or quantity could 
therefore afford increased protection from predators. Furthermore, in edible crops, 
tissue specific promoters might be used to ensure that these compounds accumulate 
specifically in tissues, such as the epidermis, which are not taken for consumption. 

Modified seed oil content . The composition of seeds, particularly with respect 
to seed oil amounts and/or composition, is very important for the nutritional value and 
production of various food and feed products. Several of the presently disclosed 
transcription factor genes that alter seed oil content or seed lipid saturation could be 
used to improve the heat stability of oils or to improve the nutritional quality of seed 
oil, by, for example, reducing the number of calories in seed, increasing the number of 
calories in animal feeds, or altering the ratio of saturated to unsaturated lipids 
comprising the oils. 

Seed and leaf fatty acid composition . A number of the presently disclosed 
transcription factor genes have been shown to alter the fatty acid composition in 
plants, and seeds in particular. This modification may find particular value for 
improving the nutritional value of, for example, seeds or whole plants. Dietary fatty 
acids ratios have been shown to have an effect on, for example, bone integrity and 
remodeling (see, for example, Weiler, H.A., Pediatr. Res. (2000) 47(5):692-697). The 
ratio of dietary fatty acids may alter the precursor pools of long-chain polyunsaturated 
fatty acids that serve as precursors for prostaglandin synthesis. In mammalian 
connective tissue, prostaglandins serve as important signals regulating the balance 
between resorption and formation in bone and cartilage. Thus dietary fatty acid ratios 
altered in seeds may affect the etiology and outcome of bone loss. 

Modified seed protein content . As with seed oils, the composition of seeds, 
particularly with respect to protein amounts and/or composition, is very important for 
the nutritional value and production of various food and feed products. A number of 
the presently disclosed transcription factor genes modify the protein concentrations in 
seeds, which would provide nutritional benefits, and may be used to prolong storage, 
increase seed pest or disease resistance, or modify germination rates. 

Production of flavonoids in leaves and other plant parts . Expression of 
presently disclosed transcription factor genes that increase flavonoid production in 
plants, including anthocyanins and condensed tannins, may be used to alter pigment 
production for horticultural purposes, and possibly to increase stress resistance. 
Flavonoids have antimicrobial activity and could be used to engineer pathogen 
resistance. Several flavonoid compounds have health promoting effects such as the 
inhibition of tumor growth and cancer, prevention of bone loss and the prevention of 
the oxidation of lipids. Increasing levels of condensed tannins, whose biosynthetic 
pathway is shared with anthocyanin biosynthesis, in forage legumes is an important 
agronomic trait because they prevent pasture bloat by collapsing protein foams within 
the rumen. For a review on the utilities of flavonoids and their derivatives, refer to 
Dixon et al. (1999) Trends Plant Sci. 4:394-400. 

Production of diterpenes in leaves and other plant parts . Depending on the 
plant species, varying amounts of diverse secondary biochemicals (often lipophilic 
terpenes) are produced and exuded or volatilized by trichomes. These exotic 
secondary biochemicals, which are relatively easy to extract because they are on the 
surface of the leaf, have been widely used in such products as flavors and aromas, 
drugs, pesticides and cosmetics. Thus, the overexpression of genes that are used to 
produce diterpenes in plants may be accomplished by introducing transcription factor 
genes that induce said overexpression. One class of secondary metabolites, the 
diterpenes, can affect several biological systems such as tumor progression, 
prostaglandin synthesis and tissue inflammation. In addition, diterpenes can act as 
insect pheromones, termite allomones, and can exhibit neurotoxic, cytotoxic and 
antimitotic activities. As a result of this functional diversity, diterpenes have been the 
target of research in several pharmaceutical ventures. In most cases where the metabolic 
pathways are impossible to engineer, increasing trichome density or size on leaves 
may be the only way to increase plant productivity. 

Production of anthocyanin in leaves and other plant parts . Several presently 
disclosed transcription factor genes can be used to alter anthocyanin production in 
numerous plant species. The potential utilities of these genes include alterations in 
pigment production for horticultural purposes, and possibly increasing stress 
resistance in combination with another transcription factor. 

Production of miscellaneous secondary metabolites . Microarray data suggest 
that flux through the aromatic amino acid biosynthetic pathways and primary and 
secondary metabolite biosynthetic pathways is up-regulated. Presently disclosed 
transcription factors have been shown to be involved in regulating alkaloid 
biosynthesis, in part by up-regulating the enzymes indole-3-glycerol phosphatase and 
strictosidine synthase. Phenylalanine ammonia lyase, chalcone synthase and 
trans-cinnamate mono-oxygenase are also induced, and are involved in phenylpropanoid 
biosynthesis. 

Sugar, starch, hemicellulose composition . Overexpression of the presently 
disclosed transcription factors that affect sugar content resulted in plants with altered 
leaf insoluble sugar content. Transcription factors that alter plant cell wall 
composition have several potential applications including altering food digestibility, 
plant tensile strength, wood quality, pathogen resistance and in pulp production. The 
potential utilities of a gene involved in glucose-specific sugar sensing are to alter 
energy balance, photosynthetic rate, carbohydrate accumulation, biomass production, 
source-sink relationships, and senescence. 

Hemicellulose is not desirable in paper pulps because of its lack of strength 
compared with cellulose. Thus modulating the amounts of cellulose vs. hemicellulose 
in the plant cell wall is desirable for the paper/lumber industry. Increasing the 
insoluble carbohydrate content in various fruits, vegetables, and other edible 
consumer products will result in enhanced fiber content. Increased fiber content 
would not only provide health benefits in food products, but might also increase 
digestibility of forage crops. In addition, the hemicellulose and pectin content of fruits 
and berries affects the quality of jam and catsup made from them. Changes in 
hemicellulose and pectin content could result in a superior consumer product. 

Plant response to sugars and sugar composition . In addition to their important 
role as an energy source and structural component of the plant cell, sugars are central 
regulatory molecules that control several aspects of plant physiology, metabolism and 
development. It is thought that this control is achieved by regulating gene expression 
and, in higher plants, sugars have been shown to repress or activate plant genes 
involved in many essential processes such as photosynthesis, glyoxylate metabolism, 
respiration, starch and sucrose synthesis and degradation, pathogen response, 
wounding response, cell cycle regulation, pigmentation, flowering and senescence. 
The mechanisms by which sugars control gene expression are not understood. 

Because sugars are important signaling molecules, the ability to control either 
the concentration of a signaling sugar or how the plant perceives or responds to a 
signaling sugar could be used to control plant development, physiology or 
metabolism. For example, the flux of sucrose (a disaccharide sugar used for 
systemically transporting carbon and energy in most plants) has been shown to affect 
gene expression and alter storage compound accumulation in seeds. Manipulation of 
the sucrose signaling pathway in seeds may therefore cause seeds to have more 
protein, oil or carbohydrate, depending on the type of manipulation. Similarly, in 
tubers, sucrose is converted to starch which is used as an energy store. It is thought 
that sugar signaling pathways may partially determine the levels of starch synthesized 
in the tubers. The manipulation of sugar signaling in tubers could lead to tubers with a 
higher starch content. 

Thus, the presently disclosed transcription factor genes that manipulate the 
sugar signal transduction pathway may lead to altered gene expression to produce 
plants with desirable traits. In particular, manipulation of sugar signal transduction 
pathways could be used to alter source-sink relationships in seeds, tubers, roots and 
other storage organs leading to increase in yield. 

Plant growth rate and development . A number of the presently disclosed 
transcription factor genes have been shown to have significant effects on plant growth 
rate and development. These observations have included, for example, more rapid or 
delayed growth and development of reproductive organs. This would provide utility 
for regions with short or long growing seasons, respectively. Accelerating plant 
growth would also improve early yield or increase biomass at an earlier stage, when 
such is desirable (for example, in producing forestry products). 

Embryo development . Presently disclosed transcription factor genes that alter 
embryo development have been used to alter seed protein and oil amounts and/or 
composition, which is very important for the nutritional value and production of 
various food products. Seed shape and seed coat may also be altered by these genes, 
which may provide for improved storage stability. 

Seed germination rate . A number of the presently disclosed transcription 
factor genes have been shown to modify seed germination rate, including when the 
seeds are in conditions normally unfavorable for germination (e.g., cold, heat or salt 
stress, or in the presence of ABA), and may thus be used to modify and improve 
germination rates under adverse conditions. 

Plant, seedling vigor . Seedlings transformed with presently disclosed 
transcription factors have been shown to possess larger cotyledons and to appear 
somewhat more advanced than control plants. This indicates that the seedlings 
developed more rapidly than the control plants. Rapid seedling development is likely 
to reduce loss due to diseases particularly prevalent at the seedling stage (e.g., 
damping off) and is thus important for survivability of plants germinating in the field 
or in controlled environments. 



Senescence, cell death . Presently disclosed transcription factor genes may be 
used to alter senescence responses in plants. Although leaf senescence is thought to be 
an evolutionary adaptation to recycle nutrients, the ability to control senescence in an 
agricultural setting has significant value. For example, a delay in leaf senescence in 
some maize hybrids is associated with a significant increase in yields and a delay of a 
few days in the senescence of soybean plants can have a large impact on yield. 
Delayed flower senescence may also generate plants that retain their blossoms longer 
and this may be of potential interest to the ornamental horticulture industry. 

Modified fertility . Plants that overexpress a number of the presently disclosed 
transcription factor genes have been shown to possess reduced fertility. This could 
be a desirable trait, as it could be exploited to prevent or minimize the escape of the 
pollen of genetically modified organisms (GMOs) into the environment. 

Early and delayed flowering . Presently disclosed transcription factor genes 
that accelerate flowering could have valuable applications in breeding programs since 
they allow much faster generation times. In a number of species, for example, 
broccoli and cauliflower, where the reproductive parts of the plants constitute the crop 
and the vegetative tissues are discarded, it would be advantageous to accelerate time 
to flowering. Accelerating flowering could shorten crop and tree breeding programs. 
Additionally, in some instances, a faster generation time might allow additional 
harvests of a crop to be made within a given growing season. A number of 
Arabidopsis genes have already been shown to accelerate flowering when 
constitutively expressed. These include LEAFY, APETALA1 and CONSTANS 
(Mandel, M. et al., 1995, Nature 377, 522-524; Weigel, D. and Nilsson, O., 1995, 
Nature 377, 495-500; Simon et al., 1996, Nature 384, 59-62). 

By regulating the expression of potential flowering genes using inducible promoters, 
flowering could be triggered by application of an inducer chemical. This would allow 
flowering to be synchronized across a crop and facilitate more efficient harvesting. 
Such inducible systems could also be used to tune the flowering of crop varieties to 
different latitudes. At present, species such as soybean and cotton are available as a 
series of maturity groups that are suitable for different latitudes on the basis of their 
flowering time (which is governed by day-length). A system in which flowering could 
be chemically controlled would allow a single high-yielding northern maturity group 
to be grown at any latitude. In southern regions such plants could be grown for longer, 
thereby increasing yields, before flowering was induced. In more northern areas, the 
induction would be used to ensure that the crop flowers prior to the first winter frosts. 

In a sizeable number of species, for example, root crops, where the vegetative 
parts of the plants constitute the crop and the reproductive tissues are discarded, it 
would be advantageous to delay or prevent flowering. Extending vegetative 
development with presently disclosed transcription factor genes could thus bring 
about large increases in yields. Prevention of flowering might help maximize 
vegetative yields and prevent escape of genetically modified organism (GMO) pollen. 

Extended flowering phase . Presently disclosed transcription factors that extend 
flowering time have utility in engineering plants with longer-lasting flowers for the 
horticulture industry, and for extending the time in which the plant is fertile. 

Flower and leaf development . Presently disclosed transcription factor genes 
have been used to modify the development of flowers and leaves. This could be 
advantageous in the development of new ornamental cultivars that present unique 
configurations. In addition, some of these genes have been shown to reduce a plant's 
fertility, which is also useful for helping to prevent development of pollen of GMOs. 

Flower abscission . Presently disclosed transcription factor genes introduced 
into plants have been used to retain flowers for longer periods. This would provide a 
significant benefit to the ornamental industry, for both cut flowers and woody plant 
varieties (of, for example, maize), as well as have the potential to lengthen the fertile 
period of a plant, which could positively impact yield and breeding programs. 

A listing of specific effects and utilities that the presently disclosed 
transcription factor genes have on plants, as determined by direct observation and 
assay analysis, is provided in Table 4. 

XVI. Antisense and Co-suppression 

In addition to expression of the nucleic acids of the invention as gene 
replacement or plant phenotype modification nucleic acids, the nucleic acids are also 
useful for sense and anti-sense suppression of expression, e.g., to down-regulate 
expression of a nucleic acid of the invention, e.g., as a further mechanism for 
modulating plant phenotype. That is, the nucleic acids of the invention, or 
subsequences or anti-sense sequences thereof, can be used to block expression of 
naturally occurring homologous nucleic acids. A variety of sense and anti-sense 
technologies are known in the art, e.g., as set forth in Lichtenstein and Nellen (1997) 
Antisense Technology: A Practical Approach IRL Press at Oxford University Press, 
Oxford, U.K. In general, sense or anti-sense sequences are introduced into a cell, 
where they are optionally amplified, e.g., by transcription. Such sequences include 
both simple oligonucleotide sequences and catalytic sequences such as ribozymes. 

For example, a reduction or elimination of expression (i.e., a "knock-out") of a 
transcription factor or transcription factor homologue polypeptide in a transgenic 
plant, e.g., to modify a plant trait, can be obtained by introducing an antisense construct 
corresponding to the polypeptide of interest as a cDNA. For antisense suppression, the 
transcription factor or homologue cDNA is arranged in reverse orientation (with 
respect to the coding sequence) relative to the promoter sequence in the expression 
vector. The introduced sequence need not be the full length cDNA or gene, and need 
not be identical to the cDNA or gene found in the plant type to be transformed. 
Typically, the antisense sequence need only be capable of hybridizing to the target 
gene or RNA of interest. Thus, where the introduced sequence is of shorter length, a 
higher degree of homology to the endogenous transcription factor sequence will be 
needed for effective antisense suppression. While antisense sequences of various 
lengths can be utilized, preferably, the introduced antisense sequence in the vector 
will be at least 30 nucleotides in length, and improved antisense suppression will 
typically be observed as the length of the antisense sequence increases. Preferably, 
the length of the antisense sequence in the vector will be greater than 100 nucleotides. 
Transcription of an antisense construct as described results in the production of RNA 
molecules that are the reverse complement of mRNA molecules transcribed from the 
endogenous transcription factor gene in the plant cell. 
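
By way of illustration only, the orientation and length criteria described above can be sketched in Python. The function names are hypothetical and are not part of the disclosure; the two thresholds are taken from the text (at least 30 nucleotides, preferably more than 100), and the sequence is assumed to contain only the letters A, C, G and T.

```python
# Illustrative sketch only; helper names are hypothetical.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

MIN_LENGTH = 30         # text: "at least 30 nucleotides in length"
PREFERRED_LENGTH = 100  # text: preferably "greater than 100 nucleotides"


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_insert(cdna_fragment: str) -> str:
    """Return the fragment as it would read when placed in reverse
    orientation relative to the promoter.

    Raises ValueError if the fragment is shorter than MIN_LENGTH.
    """
    if len(cdna_fragment) < MIN_LENGTH:
        raise ValueError("fragment is shorter than 30 nucleotides")
    return reverse_complement(cdna_fragment)


def meets_preferred_length(cdna_fragment: str) -> bool:
    """True if the fragment exceeds the preferred 100-nucleotide length."""
    return len(cdna_fragment) > PREFERRED_LENGTH


if __name__ == "__main__":
    fragment = "ATGGCTTCTGGAAGAGTTCAGATTCCTCAG"  # made-up 30-nt fragment
    print(antisense_insert(fragment))
    print(meets_preferred_length(fragment))
```

The sketch checks only length and orientation; it does not model the degree of homology to the endogenous sequence, which the text notes becomes more important as the introduced sequence gets shorter.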

Suppression of endogenous transcription factor gene expression can also be 
achieved using a ribozyme. Ribozymes are RNA molecules that possess highly 
specific endoribonuclease activity. The production and use of ribozymes are 
disclosed in U.S. Patent No. 4,987,071 and U.S. Patent No. 5,543,508. Synthetic 
ribozyme sequences including antisense RNAs can be used to confer RNA cleaving 
activity on the antisense RNA, such that endogenous mRNA molecules that hybridize 
to the antisense RNA are cleaved, which in turn leads to an enhanced antisense 
inhibition of endogenous gene expression. 

Suppression of endogenous transcription factor gene expression can also be 
achieved using RNA interference , or RNAi. RNAi is a post-transcriptional, targeted 
gene-silencing technique that uses double-stranded RNA (dsRNA) to incite 
degradation of messenger RNA (mRNA) containing the same sequence as the dsRNA 
(Constans (2002) The Scientist 16:36). Small interfering RNAs, or siRNAs, are 
produced in at least two steps: an endogenous ribonuclease cleaves longer dsRNA 
into shorter, 21-23 nucleotide-long RNAs. The siRNA segments then mediate the 
degradation of the target mRNA (Zamore (2001) Nature Struct. Biol. 8:746-750). 
RNAi has been used for gene function determination in a manner similar to antisense 
oligonucleotides (Constans (2002) The Scientist 16:36). Expression vectors that 
continually express siRNAs in transiently and stably transfected cells have been 
engineered to express small hairpin RNAs (shRNAs), which get processed in vivo into 
siRNA-like molecules capable of carrying out gene-specific silencing (Brummelkamp et al. 
(2002) Science 296:550-553, and Paddison et al. (2002) Genes & Dev. 16:948-958). 
Post-transcriptional gene silencing by double-stranded RNA is discussed in further 
detail by Hammond et al. (2001) Nature Rev. Genet. 2:110-119, Fire et al. (1998) Nature 
391:806-811 and Timmons and Fire (1998) Nature 395:854. 
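
As a rough illustration of the first step described above, the following Python sketch enumerates every 21-23 nucleotide window of a target mRNA together with its complementary strand. It is a simplification made for illustration: the function name is hypothetical, the input is assumed to be an RNA string of A, C, G and U, and no attempt is made to model where a ribonuclease actually cuts or which duplexes silence effectively.

```python
# Illustrative sketch only; names are hypothetical.
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def candidate_sirnas(mrna: str, length: int = 21):
    """List (start, sense, antisense) for each window of the given length.

    `sense` is the mRNA window itself; `antisense` is its reverse
    complement, i.e. the strand that could pair with the target mRNA.
    """
    if not 21 <= length <= 23:
        raise ValueError("length must be 21, 22 or 23 nucleotides")
    windows = []
    for start in range(len(mrna) - length + 1):
        sense = mrna[start:start + length]
        antisense = sense.translate(_RNA_COMPLEMENT)[::-1]
        windows.append((start, sense, antisense))
    return windows


if __name__ == "__main__":
    target = "AUG" + "GC" * 10  # made-up 23-nt target
    for start, sense, antisense in candidate_sirnas(target, 21):
        print(start, sense, antisense)
```

A target shorter than the requested window length simply yields an empty list.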

Vectors in which RNA encoded by a transcription factor or transcription factor 
homologue cDNA is over-expressed can also be used to obtain co-suppression of a 
corresponding endogenous gene, e.g., in the manner described in U.S. Patent No. 
5,231,020 to Jorgensen. Such co-suppression (also termed sense suppression) does 
not require that the entire transcription factor cDNA be introduced into the plant cells, 
nor does it require that the introduced sequence be exactly identical to the endogenous 
transcription factor gene of interest. However, as with antisense suppression, the 
suppressive efficiency will be enhanced as specificity of hybridization is increased, 
e.g., as the introduced sequence is lengthened, and/or as the sequence similarity 
between the introduced sequence and the endogenous transcription factor gene is 
increased. 

Vectors expressing an untranslatable form of the transcription factor mRNA 
(e.g., sequences comprising one or more stop codons or nonsense mutations) can also be 
used to suppress expression of an endogenous transcription factor, thereby reducing or 
eliminating its activity and modifying one or more traits. Methods for producing 
such constructs are described in U.S. Patent No. 5,583,021. Preferably, such 
constructs are made by introducing a premature stop codon into the transcription 
factor gene. Alternatively, a plant trait can be modified by gene silencing using 
double-strand RNA (Sharp (1999) Genes and Development 13:139-141). Another 
method for abolishing the expression of a gene is by insertion mutagenesis using the 
T-DNA of Agrobacterium tumefaciens. After generating the insertion mutants, the 
mutants can be screened to identify those containing the insertion in a transcription 
factor or transcription factor homologue gene. Plants containing a single transgene 
insertion event at the desired gene can be crossed to generate homozygous plants for 
the mutation. Such methods are well known to those of skill in the art. (See for 
example Koncz et al. (1992) Methods in Arabidopsis Research, World Scientific.) 
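
The premature stop codon approach mentioned above can be pictured with a short Python sketch. It is offered only as an illustration under stated assumptions: the function names are hypothetical, the coding sequence is assumed to be in frame with a length divisible by three, and the codon to be replaced is chosen by the caller rather than by any method taught in the cited patent.

```python
# Illustrative sketch only; names are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def introduce_premature_stop(cds: str, codon_index: int, stop: str = "TAA") -> str:
    """Replace the codon at `codon_index` (0-based) with a stop codon.

    The start codon (index 0) and the final codon are left untouched.
    """
    if stop not in STOP_CODONS:
        raise ValueError("not a stop codon")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    n_codons = len(cds) // 3
    if not 1 <= codon_index < n_codons - 1:
        raise ValueError("codon_index must fall between start and final codon")
    start = 3 * codon_index
    return cds[:start] + stop + cds[start + 3:]


def first_stop_index(cds: str):
    """Return the 0-based index of the first in-frame stop codon, or None."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


if __name__ == "__main__":
    cds = "ATGGCTGGTAAATGA"  # made-up: ATG GCT GGT AAA TGA
    mutant = introduce_premature_stop(cds, 2)
    print(mutant, first_stop_index(mutant))
```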

Alternatively, a plant phenotype can be altered by eliminating an endogenous 
gene, such as a transcription factor or transcription factor homologue, e.g., by 
homologous recombination (Kempin et al. (1997) Nature 389:802-803). 

A plant trait can also be modified by using the Cre-lox system (for example, as 
described in US Pat. No. 5,658,772). A plant genome can be modified to include 
first and second lox sites that are then contacted with a Cre recombinase. If the lox 
sites are in the same orientation, the intervening DNA sequence between the two sites 
is excised. If the lox sites are in the opposite orientation, the intervening sequence is 
inverted. 
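
The two outcomes described above (excision versus inversion) can be modeled with a toy Python sketch. The representation is an assumption made purely for illustration: the genome is a list of labeled segments, the markers "lox>" and "lox<" stand for lox sites in the two orientations, and inversion is modeled only as a reversal of segment order without tracking the strand of each segment.

```python
# Toy model only; the segment labels and marker strings are hypothetical.
LOX_FORWARD = "lox>"
LOX_REVERSE = "lox<"


def cre_recombine(segments):
    """Return a new segment list after recombination between two lox sites.

    Same orientation: the intervening segments and one lox site are removed.
    Opposite orientation: the order of the intervening segments is reversed.
    """
    sites = [i for i, s in enumerate(segments) if s in (LOX_FORWARD, LOX_REVERSE)]
    if len(sites) != 2:
        raise ValueError("expected exactly two lox sites")
    first, second = sites
    if segments[first] == segments[second]:
        # Excision: a single lox site remains in the genome.
        return segments[:first + 1] + segments[second + 1:]
    # Inversion: both lox sites stay in place.
    inverted = segments[first + 1:second][::-1]
    return segments[:first + 1] + inverted + segments[second:]


if __name__ == "__main__":
    print(cre_recombine(["promoter", "lox>", "spacer", "lox>", "gene"]))
    print(cre_recombine(["promoter", "lox>", "a", "b", "lox<", "gene"]))
```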

The polynucleotides and polypeptides of this invention can also be expressed 
in a plant in the absence of an expression cassette by manipulating the activity or 
expression level of the endogenous gene by other means. For example, a gene can be 
ectopically expressed by T-DNA activation tagging (Ichikawa et al. (1997) 
Nature 390:698-701; Kakimoto et al. (1996) Science 274:982-985). This method 
entails transforming a plant with a gene tag containing multiple transcriptional 
enhancers; once the tag has inserted into the genome, expression of a flanking 
gene coding sequence becomes deregulated. In another example, the transcriptional 
machinery in a plant can be modified so as to increase transcription levels of a 
polynucleotide of the invention (See, e.g., PCT Publications WO 96/06166 and WO 
98/53057 which describe the modification of the DNA-binding specificity of zinc 
finger proteins by changing particular amino acids in the DNA-binding motif). 

The transgenic plant can also include the machinery necessary for expressing 
or altering the activity of a polypeptide encoded by an endogenous gene, for example 
by altering the phosphorylation state of the polypeptide to maintain it in an activated 
state. 

Transgenic plants (or plant cells, or plant explants, or plant tissues) 
incorporating the polynucleotides of the invention and/or expressing the polypeptides 
of the invention can be produced by a variety of well established techniques as 
described above. Following construction of a vector, most typically an expression 
cassette, including a polynucleotide, e.g., encoding a transcription factor or 
transcription factor homologue, of the invention, standard techniques can be used to 
introduce the polynucleotide into a plant, a plant cell, a plant explant or a plant tissue 
of interest. Optionally, the plant cell, explant or tissue can be regenerated to produce 
a transgenic plant. 

The plant can be any higher plant, including gymnosperms, 
monocotyledonous and dicotyledonous plants. Suitable protocols are available for 
Leguminosae (alfalfa, soybean, clover, etc.), Umbelliferae (carrot, celery, parsnip), 
Cruciferae (cabbage, radish, rapeseed, broccoli, etc.), Cucurbitaceae (melons and 
cucumber), Gramineae (wheat, corn, rice, barley, millet, etc.), Solanaceae (potato, 
tomato, tobacco, peppers, etc.), and various other crops. See protocols described in 
Ammirato et al. (1984) Handbook of Plant Cell Culture - Crop Species, Macmillan 
Publ. Co.; Shimamoto et al. (1989) Nature 338:274-276; Fromm et al. (1990) 
Bio/Technology 8:833-839; and Vasil et al. (1990) Bio/Technology 8:429-434. 

Transformation and regeneration of both monocotyledonous and 
dicotyledonous plant cells is now routine, and the selection of the most appropriate 
transformation technique will be determined by the practitioner. The choice of 
method will vary with the type of plant to be transformed; those skilled in the art will 
recognize the suitability of particular methods for given plant types. Suitable methods 
can include, but are not limited to: electroporation of plant protoplasts; liposome- 
mediated transformation; polyethylene glycol (PEG) mediated transformation; 
transformation using viruses; micro-injection of plant cells; micro-projectile 
bombardment of plant cells; vacuum infiltration; and Agrobacterium tumefaciens 
mediated transformation. Transformation means introducing a nucleotide sequence 
into a plant in a manner to cause stable or transient expression of the sequence. 

Successful examples of the modification of plant characteristics by 
transformation with cloned sequences which serve to illustrate the current knowledge 
in this field of technology, and which are herein incorporated by reference, include: 
U.S. Patent Nos. 5,571,706; 5,677,175; 5,510,471; 5,750,386; 5,597,945; 5,589,615; 
5,750,871; 5,268,526; 5,780,708; 5,538,880; 5,773,269; 5,736,369 and 5,610,042. 

Following transformation, plants are preferably selected using a dominant 
selectable marker incorporated into the transformation vector. Typically, such a 
marker will confer antibiotic or herbicide resistance on the transformed plants, and 
selection of transformants can be accomplished by exposing the plants to appropriate 
concentrations of the antibiotic or herbicide. 

After transformed plants are selected and grown to maturity, those plants 
showing a modified trait are identified. The modified trait can be any of those traits 
described above. Additionally, to confirm that the modified trait is due to changes in 
expression levels or activity of the polypeptide or polynucleotide of the invention, 
such changes can be determined by analyzing mRNA expression using Northern blots, 
RT-PCR or microarrays, or protein expression using immunoblots or Western blots or 
gel shift assays. 

XVII. Integrated Systems - Sequence Identity 

Additionally, the present invention may be an integrated system, computer or 
computer readable medium that comprises an instruction set for determining the 
identity of one or more sequences in a database. In addition, the instruction set can be 
used to generate or identify sequences that meet any specified criteria. Furthermore, 
the instruction set may be used to associate or link certain functional benefits, such 
as improved characteristics, with one or more identified sequences. 

For example, the instruction set can include a sequence comparison or other 
alignment program, e.g., an available program such as the Wisconsin Package 
Version 10.0 programs BLAST, FASTA, PILEUP, FINDPATTERNS or the like 
(GCG, Madison, WI). Public sequence databases such as GenBank, EMBL, 
Swiss-Prot and PIR or private sequence databases such as the PHYTOSEQ 
sequence database (Incyte Genomics, Palo Alto, CA) can be searched. 

Alignment of sequences for comparison can be conducted by the local 
homology algorithm of Smith and Waterman (1981) Adv. Appl. Math. 2:482, by the 
homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 
48:443-453, by the search for similarity method of Pearson and Lipman (1988) Proc. 
Natl. Acad. Sci. U.S.A. 85:2444-2448, or by computerized implementations of these 
algorithms. After alignment, sequence comparisons between two (or more) 
polynucleotides or polypeptides are typically performed by comparing the 
sequences over a comparison window to identify and compare local regions 
of sequence similarity. The comparison window can be a segment of at least about 20 
contiguous positions, usually about 50 to about 200, more usually about 100 to about 
150 contiguous positions. A description of the method is provided in Ausubel et al., 
supra. 

A variety of methods for determining sequence relationships can be used, 
including manual alignment and computer assisted sequence alignment and analysis. 
This latter approach is a preferred approach in the present invention, due to the 
increased throughput afforded by computer assisted methods. As noted above, a 
variety of computer programs for performing sequence alignment are available, or can 
be produced by one of skill. 
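
As an illustration of a computer assisted alignment, the following is a minimal 
sketch of local alignment scoring in the manner of Smith and Waterman. The 
function name, the scoring values (match, mismatch, gap) and the linear gap 
penalty are assumptions chosen for brevity; they are not parameters taken from this 
disclosure or from any named program. 

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between strings a and b.

    Scoring values are illustrative assumptions, with a linear gap penalty.
    """
    # prev holds the previous row of the dynamic-programming matrix.
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in sequence b
            left = curr[j - 1] + gap  # gap in sequence a
            # Local alignment: a cell score never drops below zero.
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best
```

Only the best score is returned; recovering the aligned region itself would 
additionally require a traceback through the matrix. 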

One example algorithm that is suitable for determining percent sequence 
identity and sequence similarity is the BLAST algorithm, which is described in 
Altschul et al. J. Mol. Biol 215:403-410 (1990). Software for performing BLAST 
analyses is publicly available, e.g., through the National Center for Biotechnology 
Information (see internet website at ncbi.nlm.nih.gov). This algorithm involves first 
identifying high scoring sequence pairs (HSPs) by identifying short words of length 
W in the query sequence, which either match or satisfy some positive-valued 
threshold score T when aligned with a word of the same length in a database 
sequence. T is referred to as the neighborhood word score threshold (Altschul et al., 
supra). These initial neighborhood word hits act as seeds for initiating searches to 
find longer HSPs containing them. The word hits are then extended in both directions 
along each sequence for as far as the cumulative alignment score can be increased. 
Cumulative scores are calculated using, for nucleotide sequences, the parameters M 
(reward score for a pair of matching residues; always > 0) and N (penalty score for 
mismatching residues; always < 0). For amino acid sequences, a scoring matrix is 
used to calculate the cumulative score. Extension of the word hits in each direction 
is halted when: the cumulative alignment score falls off by the quantity X from its 
maximum achieved value; the cumulative score goes to zero or below, due to the 
accumulation of one or more negative-scoring residue alignments; or the end of either 
sequence is reached. The BLAST algorithm parameters W, T, and X determine the 
sensitivity and speed of the alignment. The BLASTN program (for nucleotide 
sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a cutoff 
of 100, M=5, N=-4, and a comparison of both strands. For amino acid sequences, the 
BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, 
and the BLOSUM62 scoring matrix (see Henikoff & Henikoff (1989) Proc. Natl. 
Acad. Sci. USA 89:10915). Unless otherwise indicated, "sequence identity" here 
refers to the % sequence identity generated from a tblastx using the NCBI version of 
the algorithm at the default settings using gapped alignments with the filter "off" (see, 
for example, internet website at ncbi.nlm.nih.gov). 
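
The extension step described above can be sketched as follows for the ungapped, 
nucleotide case. This is a simplified illustration and not the NCBI 
implementation: it extends a seed word hit in one direction only, the function 
name is hypothetical, and the value of X and the short word length used in the 
test are assumptions. M=5 and N=-4 are the BLASTN defaults stated in the text. 

```python
def extend_right(query, subject, q_start, s_start, word_len, M=5, N=-4, X=20):
    """Extend an exact word hit to the right without gaps.

    Returns (best_score, best_length), where best_length counts aligned
    positions from the start of the seed word. X is an assumed drop-off value.
    """
    score = word_len * M  # the seed word is taken to be an exact match
    best_score, best_len = score, word_len
    q, s = q_start + word_len, s_start + word_len
    length = word_len
    # Stop when the end of either sequence is reached.
    while q < len(query) and s < len(subject):
        score += M if query[q] == subject[s] else N
        length += 1
        if score > best_score:
            best_score, best_len = score, length
        # Stop when the score falls X below its maximum, or to zero or below.
        if best_score - score >= X or score <= 0:
            break
        q += 1
        s += 1
    return best_score, best_len
```

A full search would extend in both directions from each seed and keep the 
highest-scoring segment pair. 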

In addition to calculating percent sequence identity, the BLAST algorithm also 
performs a statistical analysis of the similarity between two sequences (see, e.g., 
Karlin & Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-5877). One measure 
of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), 
which provides an indication of the probability by which a match between two 
nucleotide or amino acid sequences would occur by chance. For example, a nucleic 
acid is considered similar to a reference sequence (and, therefore, in this context, 
homologous) if the smallest sum probability in a comparison of the test nucleic acid to 
the reference nucleic acid is less than about 0.1, or less than about 0.01, or even 
less than about 0.001. An additional example of a useful sequence alignment 
algorithm is PILEUP. PILEUP creates a multiple sequence alignment from a group of 
related sequences using progressive, pairwise alignments. The program can align, e.g., 
up to 300 sequences of a maximum length of 5,000 letters. 

The integrated system or computer typically includes a user input interface 
allowing a user to selectively view one or more sequence records corresponding to the 
one or more character strings, as well as an instruction set which aligns the one or 
more character strings with each other or with an additional character string to 
identify one or more regions of sequence similarity. The system may include a link of 
one or more character strings with a particular phenotype or gene function. Typically, 
the system includes a user readable output element that displays an alignment 
produced by the alignment instruction set. 

The methods of this invention can be implemented in a localized or distributed 
computing environment. In a distributed environment, the methods may be implemented 
on a single computer comprising multiple processors or on a multiplicity of 
computers. The computers can be linked, e.g., through a common bus, but more 
preferably the computer(s) are nodes on a network. The network can be a generalized 
or a dedicated local or wide-area network and, in certain preferred embodiments, the 
computers may be components of an intranet or an internet. 

Thus, the invention provides methods for identifying a sequence similar or 
homologous to one or more polynucleotides as noted herein, or one or more target 
polypeptides encoded by the polynucleotides, or otherwise noted herein and may 
include linking or associating a given plant phenotype or gene function with a 
sequence. In the methods, a sequence database is provided (locally or across an 
internet or intranet) and a query is made against the sequence database using the 
relevant sequences herein and associated plant phenotypes or gene functions. 

Any sequence herein can be entered into the database, before or after querying 
the database. This provides for both expansion of the database, and, if done before the 
querying step, for insertion of control sequences into the database. The control 
sequences can be detected by the query to ensure the general integrity of both the 
database and the query. As noted, the query can be performed using a web browser 
based interface. For example, the database can be a centralized public database such 
as those noted herein, and the querying can be done from a remote terminal or 
computer across an internet or intranet. 

XVIII. Examples 

The following examples are intended to illustrate but not limit the present 
invention. The complete descriptions of the traits associated with each polynucleotide 
of the invention are fully disclosed in Table 4 and Table 6. 

Example I: Full Length Gene Identification and Cloning 

Putative transcription factor sequences (genomic or ESTs) related to known 
transcription factors were identified in the Arabidopsis thaliana GenBank database 
using the tblastn sequence analysis program using default parameters and a P-value 
cutoff threshold of -4 or -5 or lower, depending on the length of the query sequence. 
Putative transcription factor sequence hits were then screened to identify those 
containing particular sequence strings. If the sequence hits contained such sequence 
strings, the sequences were confirmed as transcription factors. 

Alternatively, Arabidopsis thaliana cDNA libraries derived from different 
tissues or treatments, or genomic libraries were screened to identify novel members of 
a transcription family using a low stringency hybridization approach. Probes were 
synthesized using gene specific primers in a standard PCR reaction (annealing 
temperature 60° C) and labeled with 32P dCTP using the High Prime DNA Labeling 
Kit (Boehringer Mannheim). Purified radiolabeled probes were added to filters 
immersed in Church hybridization medium (0.5 M NaPO4 pH 7.0, 7% SDS, 1% w/v 
bovine serum albumin) and hybridized overnight at 60°C with shaking. Filters were 
washed two times for 45 to 60 minutes with 1x SSC, 1% SDS at 60° C. 

To identify additional sequence 5' or 3' of a partial cDNA sequence in a cDNA 
library, 5' and 3' rapid amplification of cDNA ends (RACE) was performed using the 
U.C. Marathon cDNA amplification kit (Clontech, Palo Alto, CA). Generally, the 
method entailed first isolating poly(A) mRNA, performing first and second strand 
cDNA synthesis to generate double stranded cDNA, blunting cDNA ends, followed 
by ligation of the U.C. Marathon Adaptor to the cDNA to form a library of adaptor- 
ligated ds cDNA. 

Gene-specific primers were designed to be used along with adaptor specific 
primers for both 5' and 3' RACE reactions. Nested primers, rather than single 
primers, were used to increase PCR specificity. Using 5' and 3' RACE reactions, 5' 
and 3' RACE fragments were obtained, sequenced and cloned. The process was 
repeated until the 5' and 3' ends of the full-length gene were identified. The 
full-length cDNA was then generated by end-to-end PCR using primers specific to 
the 5' and 3' ends of the gene. 

Example II: Construction of Expression Vectors 

The sequence was amplified from a genomic or cDNA library using primers 
specific to sequences upstream and downstream of the coding region. The expression 
vector was pMEN20 or pMEN65, which are both derived from pMON316 (Sanders et 
al. (1987) Nucleic Acids Research 15:1543-1558) and contain the CaMV 35S 
promoter to express transgenes. To clone the sequence into the vector, both pMEN20 
and the amplified DNA fragment were digested separately with SalI and NotI 
restriction enzymes at 37° C for 2 hours. The digestion products were subjected to 
electrophoresis in a 0.8% agarose gel and visualized by ethidium bromide staining. 
The DNA fragments containing the sequence and the linearized plasmid were excised 
and purified by using a Qiaquick gel extraction kit (Qiagen, Valencia CA). The 
fragments of interest were ligated at a ratio of 3:1 (vector to insert). Ligation 
reactions using T4 DNA ligase (New England Biolabs, Beverly MA) were carried out 
at 16° C for 16 hours. The ligated DNAs were transformed into competent cells of the 
E. coli strain DH5 alpha by using the heat shock method. The transformations were 
plated on LB plates containing 50 mg/l kanamycin (Sigma, St. Louis, MO). 
Individual colonies were grown overnight in five milliliters of LB broth containing 50 
mg/l kanamycin at 37° C. Plasmid DNA was purified by using Qiaquick Mini Prep 
kits (Qiagen). 

Example III: Transformation of Agrobacterium with the Expression Vector 

After the plasmid vector containing the gene was constructed, the vector was 
used to transform Agrobacterium tumefaciens cells expressing the gene products. The 
stock of Agrobacterium tumefaciens cells for transformation was made as described 
by Nagel et al. (1990) FEMS Microbiol Letts. 67: 325-328. Agrobacterium strain 
ABI was grown in 250 ml LB medium (Sigma) overnight at 28°C with shaking until 
an absorbance (A600) of 0.5 - 1.0 was reached. Cells were harvested by centrifugation 
at 4,000 x g for 15 min at 4° C. Cells were then resuspended in 250 µl chilled buffer 
(1 mM HEPES, pH adjusted to 7.0 with KOH). Cells were centrifuged again as 
described above and resuspended in 125 µl chilled buffer. Cells were then 
centrifuged and resuspended two more times in the same HEPES buffer as described 
above at a volume of 100 µl and 750 µl, respectively. Resuspended cells were then 
distributed into 40 µl aliquots, quickly frozen in liquid nitrogen, and stored at -80° C. 

Agrobacterium cells were transformed with plasmids prepared as described 
above following the protocol described by Nagel et al. For each DNA construct to be 
transformed, 50-100 ng DNA (generally resuspended in 10 mM Tris-HCl, 1 mM 
EDTA, pH 8.0) was mixed with 40 µl of Agrobacterium cells. The DNA/cell mixture 
was then transferred to a chilled cuvette with a 2 mm electrode gap and subjected to a 2.5 
kV charge dissipated at 25 µF and 200 µF using a Gene Pulser II apparatus (Bio-Rad, 
Hercules, CA). After electroporation, cells were immediately resuspended in 1.0 ml 
LB and allowed to recover without antibiotic selection for 2-4 hours at 28° C in a 
shaking incubator. After recovery, cells were plated onto selective medium of LB 
broth containing 100 µg/ml spectinomycin (Sigma) and incubated for 24-48 hours at 
28° C. Single colonies were then picked and inoculated in fresh medium. The 
presence of the plasmid construct was verified by PCR amplification and sequence 
analysis. 



Example IV: Transformation of Arabidopsis Plants with Agrobacterium 
tumefaciens with Expression Vector 

After transformation of Agrobacterium tumefaciens with plasmid vectors 
containing the gene, single Agrobacterium colonies were identified, propagated, and 
used to transform Arabidopsis plants. Briefly, 500 ml cultures of LB medium 
containing 50 mg/l kanamycin were inoculated with the colonies and grown at 28° C 
with shaking for 2 days until an optical absorbance at 600 nm wavelength over 1 cm 
(A600) of > 2.0 was reached. Cells were then harvested by centrifugation at 4,000 x g for 
10 min, and resuspended in infiltration medium (1/2 X Murashige and Skoog salts 
(Sigma), 1 X Gamborg's B-5 vitamins (Sigma), 5.0% (w/v) sucrose (Sigma), 0.044 
µM benzylamino purine (Sigma), 200 µl/L Silwet L-77 (Lehle Seeds)) until an A600 of 
0.8 was reached. 

Prior to transformation, Arabidopsis thaliana seeds (ecotype Columbia) were 
sown at a density of ~10 plants per 4" pot onto Pro-Mix BX potting medium 
(Hummert International) covered with fiberglass mesh (18 mm X 16 mm). Plants 
were grown under continuous illumination (50-75 µE/m2/sec) at 22-23° C with 
65-70% relative humidity. After about 4 weeks, primary inflorescence stems (bolts) were 
cut off to encourage growth of multiple secondary bolts. After flowering of the 
mature secondary bolts, plants were prepared for transformation by removal of all 
siliques and opened flowers. 

The pots were then immersed upside down in the mixture of Agrobacterium 
infiltration medium as described above for 30 sec, and placed on their sides to allow 
draining into a 1' x 2' flat surface covered with plastic wrap. After 24 h, the plastic 
wrap was removed and pots were turned upright. The immersion procedure was 
repeated one week later, for a total of two immersions per pot. Seeds were then 
collected from each transformation pot and analyzed following the protocol described 
below. 

Example V: Identification of Arabidopsis Primary Transformants 

Seeds collected from the transformation pots were sterilized essentially as 
follows. Seeds were dispersed into a solution containing 0.1% (v/v) Triton X-100 
(Sigma) and sterile H2O and washed by shaking the suspension for 20 min. The wash 
solution was then drained and replaced with fresh wash solution to wash the seeds for 
20 min with shaking. After removal of the second wash solution, a solution 
containing 0.1% (v/v) Triton X-100 and 70% ethanol (Equistar) was added to the 
seeds and the suspension was shaken for 5 min. After removal of the 
ethanol/detergent solution, a solution containing 0.1% (v/v) Triton X-100 and 30% 
(v/v) bleach (Clorox) was added to the seeds, and the suspension was shaken for 10 
min. After removal of the bleach/detergent solution, seeds were then washed five 
times in sterile distilled H2O. The seeds were stored in the last wash water at 4° C for 
2 days in the dark before being plated onto antibiotic selection medium (1 X 
Murashige and Skoog salts (pH adjusted to 5.7 with 1M KOH), 1 X Gamborg's B-5 
vitamins, 0.9% phytagar (Life Technologies), and 50 mg/l kanamycin). Seeds were 
germinated under continuous illumination (50-75 µE/m2/sec) at 22-23° C. After 7-10 
days of growth under these conditions, kanamycin resistant primary transformants (T1 
generation) were visible and obtained. These seedlings were transferred first to fresh 
selection plates where the seedlings continued to grow for 3-5 more days, and then to 
soil (Pro-Mix BX potting medium). 

Primary transformants were crossed and progeny seeds (T2) collected; 
kanamycin resistant seedlings were selected and analyzed. The expression levels of 
the recombinant polynucleotides in the transformants vary from about a 5% 
expression level increase to at least a 100% expression level increase. Similar 
observations are made with respect to polypeptide level expression. 

Example VI: Identification of Arabidopsis Plants with Transcription Factor Gene 
Knockouts 

The screening of insertion mutagenized Arabidopsis collections for null 
mutants in a known target gene was essentially as described in Krysan et al (1999) 
Plant Cell 11:2283-2290. Briefly, gene-specific primers, nested by 5-250 base pairs 
to each other, were designed from the 5' and 3' regions of a known target gene. 
Similarly, nested sets of primers were also created specific to each of the T-DNA or 
transposon ends (the "right" and "left" borders). All possible combinations of gene 
specific and T-DNA/transposon primers were used to detect by PCR an insertion 
event within or close to the target gene. The amplified DNA fragments were then 
sequenced, which allows the precise determination of the T-DNA/transposon insertion 
point relative to the target gene. Insertion events within the coding or intervening 
sequence of the genes were deconvoluted from a pool comprising a plurality of 
insertion events to a single unique mutant plant for functional characterization. The 
method is described in more detail in Yu and Adam, US Application Serial No. 
09/177,733 filed October 23, 1998. 
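
The pairing of gene-specific primers with insertion-element primers can be 
sketched as a simple enumeration. The primer names below are hypothetical 
placeholders, and the sketch covers a single round of PCR; with nested primer sets 
the same enumeration would presumably be repeated for the nested round. 

```python
from itertools import product


def primer_pairs(gene_primers, insertion_primers):
    """Pair every gene-specific primer with every T-DNA/transposon border primer."""
    return list(product(gene_primers, insertion_primers))


# Hypothetical primer names, for illustration only.
gene_primers = ["gene_5prime", "gene_3prime"]
insertion_primers = ["LB", "RB"]  # left and right border primers

pairs = primer_pairs(gene_primers, insertion_primers)
for gene_primer, border_primer in pairs:
    print(gene_primer, "+", border_primer)
```

Each pair corresponds to one PCR reaction; an amplification product from any pair 
suggests an insertion within or near the target gene. 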

Example VII: Identification of Modified Phenotypes in Overexpression or Gene 
Knockout Plants 

Experiments were performed to identify those transformants or knockouts that 
exhibited modified biochemical characteristics. Among the biochemicals that were 
assayed were insoluble sugars, such as arabinose, fucose, galactose, mannose, 
rhamnose or xylose or the like; prenyl lipids, such as lutein, beta-carotene, 
xanthophyll-1, xanthophyll-2, chlorophylls A or B, or alpha-, delta- or 
gamma-tocopherol or the like; fatty acids, such as 16:0 (palmitic acid), 16:1 (palmitoleic 
acid), 18:0 (stearic acid), 18:1 (oleic acid), 18:2 (linoleic acid), 20:0, 18:3 (linolenic 
acid), 20:1 (eicosenoic acid), 20:2, 22:1 (erucic acid) or the like; waxes, such as by 
altering the levels of C29, C31, or C33 alkanes; sterols, such as brassicasterol, 
campesterol, stigmasterol, sitosterol or stigmastanol or the like; glucosinolates; and 
protein or oil levels. 

Fatty acids were measured using two methods depending on whether the tissue 
was from leaves or seeds. For leaves, lipids were extracted and esterified with hot 
methanolic H2SO4 and partitioned into hexane from methanolic brine. For seed fatty 
acids, seeds were pulverized and extracted in 
methanol:heptane:toluene:2,2-dimethoxypropane:H2SO4 (39:34:20:5:2) for 90 minutes 
at 80°C. After cooling to 
room temperature the upper phase, containing the seed fatty acid esters, was subjected 
to GC analysis. Fatty acid esters from both seed and leaf tissues were analyzed with a 
Supelco SP-2330 column. 

Glucosinolates were purified from seeds or leaves by first heating the tissue at 
95°C for 10 minutes. Preheated ethanol:water (50:50) was added and, after heating at 
95°C for a further 10 minutes, the extraction solvent was applied to a DEAE Sephadex 
column which had been previously equilibrated with 0.5 M pyridine acetate. 
Desulfoglucosinolates were eluted with 300 µl water and analyzed by reverse phase 
HPLC monitoring at 226 nm. 

For wax alkanes, samples were extracted using the same method as for fatty 
acids and extracts were analyzed on a HP 5890 GC coupled with a 5973 MSD. 
Samples were chromatographically isolated on a J&W DB35 mass spectrometer 
(J&W Scientific). 

To measure prenyl lipid levels, seeds or leaves were pulverized with 1 to 2% 
pyrogallol as an antioxidant. For seeds, extracted samples were filtered and a portion 
removed for tocopherol and carotenoid/chlorophyll analysis by HPLC. The 
remaining material was saponified for sterol determination. For leaves, an aliquot 
was removed and diluted with methanol and chlorophyll A, chlorophyll B, and total 
carotenoids measured by spectrophotometry by determining optical absorbance at 
665.2 nm, 652.5 nm, and 470 nm. An aliquot was removed for tocopherol and 
carotenoid/chlorophyll composition by HPLC using a Waters µBondapak C18 column 
(4.6 mm x 150 mm). The remaining methanolic solution was saponified with 10% 
KOH at 80°C for one hour. The samples were cooled and diluted with a mixture of 
methanol and water. A solution of 2% methylene chloride in hexane was mixed in 
and the samples were centrifuged. The aqueous methanol phase was again 
re-extracted with 2% methylene chloride in hexane and, after centrifugation, the two upper 
phases were combined and evaporated. 2% methylene chloride in hexane was added 
to the tubes and the samples were then extracted with one ml of water. The upper 
phase was removed, dried, and resuspended in 400 µl of 2% methylene chloride in 
hexane and analyzed by gas chromatography using a 50 m DB-5ms (0.25 mm ID, 
0.25 µm phase, J&W Scientific). 

Insoluble sugar levels were measured by the method essentially described by 
Reiter et al., (1997) Plant Journal 12:335-345. This method analyzes the neutral sugar 
composition of cell wall polymers found in Arabidopsis leaves. Soluble sugars were 
separated from sugar polymers by extracting leaves with hot 70% ethanol. The 
remaining residue containing the insoluble polysaccharides was then acid hydrolyzed 
with allose added as an internal standard. Sugar monomers generated by the 
hydrolysis were then reduced to the corresponding alditols by treatment with NaBH4, 
then were acetylated to generate the volatile alditol acetates which were then analyzed 
by GC-FID. Identity of the peaks was determined by comparing the retention times 
of known sugars converted to the corresponding alditol acetates with the retention 
times of peaks from wild-type plant extracts. Alditol acetates were analyzed on a 
Supelco SP-2330 capillary column (30 m x 250 µm x 0.2 µm) using a temperature 
program beginning at 180° C for 2 minutes followed by an increase to 220° C in 4 
minutes. After holding at 220° C for 10 minutes, the oven temperature was increased to 
240° C in 2 minutes and held at this temperature for 10 minutes and brought back to 
room temperature. 
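
The oven temperature program above can be tabulated to check its total duration 
and ramp rates. The sketch below encodes only the hold and ramp steps stated in 
the text; the final return to room temperature is omitted because no duration is 
given for it, and the data layout and function name are assumptions made for 
illustration. 

```python
# Oven program from the text: (kind, target temperature in deg C, minutes).
PROGRAM = [
    ("hold", 180, 2),
    ("ramp", 220, 4),
    ("hold", 220, 10),
    ("ramp", 240, 2),
    ("hold", 240, 10),
]


def summarize(program):
    """Return (total minutes, list of ramp rates in deg C per minute)."""
    total = 0
    rates = []
    temp = program[0][1]  # starting temperature
    for kind, target, minutes in program:
        total += minutes
        if kind == "ramp":
            rates.append((target - temp) / minutes)
        temp = target
    return total, rates


total_minutes, ramp_rates = summarize(PROGRAM)
print(total_minutes, ramp_rates)
```

Under these steps the run lasts 28 minutes before cooling, and both ramps 
correspond to 10° C per minute. 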

To identify plants with alterations in total seed oil or protein content, 150 mg 
of seeds from T2 progeny plants were subjected to analysis by Near Infrared 
Reflectance Spectroscopy (NIRS) using a Foss NirSystems Model 6500 with a 
spinning cup transport system. NIRS is a non-destructive analytical method used to 
determine seed oil and protein composition. Infrared is the region of the 
electromagnetic spectrum located after the visible region in the direction of longer 
wavelengths. 'Near infrared' owes its name to being the infrared region near to the 
visible region of the electromagnetic spectrum. For practical purposes, near infrared 
comprises wavelengths between 800 and 2500 nm. NIRS is applied to organic 
compounds rich in O-H bonds (such as moisture, carbohydrates, and fats), C-H bonds 
(such as organic compounds and petroleum derivatives), and N-H bonds (such as 
proteins and amino acids). The NIRS analytical instruments operate by statistically 
correlating NIRS signals at several wavelengths with the characteristic or property 
intended to be measured. All biological substances contain thousands of C-H, O-H, 
and N-H bonds. Therefore, the exposure to near infrared radiation of a biological 
sample, such as a seed, results in a complex spectrum which contains qualitative and 
quantitative information about the physical and chemical composition of that sample. 

The numerical value of a specific analyte in the sample, such as protein 
content or oil content, is mediated by a calibration approach known as chemometrics. 
Chemometrics applies statistical methods such as multiple linear regression (MLR), 
partial least squares (PLS), and principal component analysis (PCA) to the spectral 
data and correlates them with a physical property or other factor; that property or 
factor is directly determined, rather than the analyte concentration itself. The method 
first provides "wet chemistry" data of the samples required to develop the calibration. 

Calibration for Arabidopsis seed oil composition was performed using 
accelerated solvent extraction using 1 g seed sample size and was validated against 
certified canola seed. A similar wet chemistry approach was performed for seed 
protein composition calibration. 
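
The idea of calibrating spectral signals against wet chemistry values can be 
illustrated with the simplest case of linear regression, using a single wavelength. 
This is only a sketch: real chemometric calibrations use many wavelengths with 
MLR, PLS or PCA, and the signal values and oil percentages below are invented for 
illustration and are not data from this disclosure. 

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(signal, slope, intercept):
    """Predict the analyte value for a new spectral signal."""
    return slope * signal + intercept


# Hypothetical calibration set: NIRS signal at one wavelength versus
# seed oil content (%) measured by wet chemistry. Values are invented.
signals = [0.20, 0.30, 0.40, 0.50]
oil_percent = [30.0, 34.0, 38.0, 42.0]

slope, intercept = fit_line(signals, oil_percent)
print(predict(0.35, slope, intercept))
```

Once fitted on reference samples, the calibration is applied to the spectra of new 
samples without further wet chemistry. 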

Data obtained from NIRS analysis was analyzed statistically using a nearest- 
neighbor (N-N) analysis. The N-N analysis allows removal of within-block spatial 
variability in a fairly flexible fashion which does not require prior knowledge of the 
pattern of variability in the chamber. Ideally, all hybrids are grown under identical 
experimental conditions within a block (rep). In reality, even in many block designs, 
significant within-block variability exists. Nearest-neighbor procedures are based on 
the assumption that the environmental effect of a plot is closely related to that of its 
neighbors. Nearest-neighbor methods use information from adjacent plots to adjust 
for within-block heterogeneity and so provide more precise estimates of treatment 
means and differences. If there is within-plot heterogeneity on a spatial scale that is 
larger than a single plot and smaller than the entire block, then yields from adjacent 
plots will be positively correlated. Information from neighboring plots can be used to 
reduce or remove the unwanted effect of the spatial heterogeneity, and hence improve 
the estimate of the treatment effect. Data from neighboring plots can also be used to 
reduce the influence of competition between adjacent plots. The Papadakis N-N 
analysis can be used with designs to remove within-block variability that would not 
be removed with the standard split plot analysis (Papadakis, 1973, Inst. d'Amelior. 
Plantes Thessaloniki (Greece) Bull. Scientif., No. 23; Papadakis, 1984, Proc. Acad. 
Athens, 59, 326-342). 
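
A nearest-neighbor adjustment of this kind can be sketched as follows for a 
single row of plots. This is a simplified, single-pass illustration in the spirit of 
the Papadakis approach, not the analysis actually run on the NIRS data: the 
one-dimensional layout, the function name and the use of a pooled 
within-treatment slope are assumptions. 

```python
def papadakis_adjust(values, treatments):
    """Adjust treatment means for a single row of plots (at least two plots).

    values[i] is the measurement on plot i; treatments[i] is its treatment label.
    Returns a dict mapping each treatment to its covariate-adjusted mean.
    """
    n = len(values)
    labels = sorted(set(treatments))
    members = {t: [i for i in range(n) if treatments[i] == t] for t in labels}
    mean_y = {t: sum(values[i] for i in idx) / len(idx) for t, idx in members.items()}

    # Residual of each plot from its own treatment mean.
    resid = [values[i] - mean_y[treatments[i]] for i in range(n)]

    # Covariate: mean residual of the adjacent plots (one neighbor at the row ends).
    cov = []
    for i in range(n):
        nb = [resid[j] for j in (i - 1, i + 1) if 0 <= j < n]
        cov.append(sum(nb) / len(nb))

    mean_x = {t: sum(cov[i] for i in idx) / len(idx) for t, idx in members.items()}
    grand_x = sum(cov) / n

    # Pooled within-treatment regression slope of measurement on covariate.
    sxy = sum((cov[i] - mean_x[treatments[i]]) * resid[i] for i in range(n))
    sxx = sum((cov[i] - mean_x[treatments[i]]) ** 2 for i in range(n))
    slope = sxy / sxx if sxx else 0.0

    return {t: mean_y[t] - slope * (mean_x[t] - grand_x) for t in labels}
```

When there is no spatial trend the adjusted means equal the raw treatment means; 
when a trend runs along the row, the adjustment moves the means toward each other. 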

Experiments were performed to identify those transformants or knockouts that 
exhibited an improved pathogen tolerance. For such studies, the transformants were 
exposed to biotrophic fungal pathogens, such as Erysiphe orontii, and necrotrophic 
fungal pathogens, such as Fusarium oxysporum. Fusarium oxysporum isolates cause 
vascular wilts and damping off of various annual vegetables, perennials and weeds 
(Mauch-Mani and Slusarenko (1994) Molecular Plant-Microbe Interactions 7: 
378-383). For Fusarium oxysporum experiments, plants grown on Petri dishes were 
sprayed with a fresh spore suspension of F. oxysporum. The spore suspension was 
prepared as follows: A plug of fungal hyphae from a plate culture was placed on a 
fresh potato dextrose agar plate and allowed to spread for one week. 5 ml sterile 
water was then added to the plate, swirled, and pipetted into 50 ml Armstrong 
Fusarium medium. Spores were grown overnight in Fusarium medium and then 
sprayed onto plants using a Preval paint sprayer. Plant tissue was harvested and 
frozen in liquid nitrogen 48 hours post infection. 

Erysiphe orontii is a causal agent of powdery mildew. For Erysiphe orontii 
experiments, plants were grown approximately 4 weeks in a greenhouse under 12 
hour light (20°C, ~30% relative humidity (rh)). Individual leaves were infected with 
E. orontii spores from infected plants using a camel's hair brush, and the plants were 
transferred to a Percival growth chamber (20°C, 80% rh.). Plant tissue was harvested 
and frozen in liquid nitrogen 7 days post infection. 

Botrytis cinerea is a necrotrophic pathogen. Botrytis cinerea was grown on 
potato dextrose agar in the light. A spore culture was made by spreading 10 ml of 
sterile water on the fungus plate, swirling and transferring spores to 10 ml of sterile 
water. The spore inoculum (approx. 10^5 spores/ml) was used to spray 10 day-old 
seedlings grown under sterile conditions on MS (minus sucrose) media. Symptoms 
were evaluated every day up to approximately 1 week. 

Infection with bacterial pathogens Pseudomonas syringae pv maculicola (Psm) 
strain 4326 and pv maculicola strain 4326 was performed by hand inoculation at two 
doses. Using two inoculation doses allows differentiation between plants with enhanced 
susceptibility and plants with enhanced resistance to the pathogen. Plants were grown 
for 3 weeks in the greenhouse, then transferred to the growth chamber for the 
remainder of their growth. Psm ES4326 was hand inoculated with a 1 ml syringe on 3 
fully-expanded leaves per plant (4 1/2 wk old), using at least 9 plants per 
overexpressing line at two inoculation doses, OD=0.005 and OD=0.0005. Disease 
scoring occurred at day 3 post-inoculation with pictures of the plants and leaves taken 
in parallel. 



In some instances, expression patterns of the pathogen-induced genes (such as 
defense genes) was monitored by microarray experiments. cDNAs were generated by 
PCR and resuspended at a final concentration of- 100 ng/ul in 3X SSC or 150mM 
Na-phosphate (Eisen and Brown (1999) Methods Enzymol 303:179-205). The 
cDNAs were spotted on microscope glass slides coated with polylysine. The prepared 
cDNAs were aliquoted into 384 well plates and spotted on the slides using an x-y-z 
gantry (OmniGrid) purchased from GeneMachines (Menlo Park, CA) outfitted with 
quill-type pins purchased from Telechem International (Sunnyvale, CA). After 
spotting, the arrays were cured for a minimum of one week at room temperature, 
rehydrated and blocked following the protocol recommended by Eisen and Brown 
(1999; supra). 

Total RNA samples (10 ug) were labeled using fluorescent Cy3 and 
Cy5 dyes. Labeled samples were resuspended in 4X SSC/0.03% SDS/4 ug salmon 
sperm DNA/2 ug tRNA/50mM Na-pyrophosphate, heated at 95°C for 2.5 minutes, 
spun down and placed on the array. The array was then covered with a glass 
coverslip and placed in a sealed chamber. The chamber was then kept in a water bath 
at 62°C overnight. The arrays were washed as described in Eisen and Brown (1999) 
and scanned on a General Scanning 3000 laser scanner. The resulting files were 
subsequently quantified using Imagene software purchased from BioDiscovery 
(Los Angeles, CA). 
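
For orientation, two-color array data of this kind are commonly summarized per spot as a log2 ratio of the background-corrected Cy5 and Cy3 intensities. The sketch below illustrates that general calculation only; it is not a description of how the Imagene software works, and the function name, the intensity values, and the floor of 1.0 used to avoid taking the logarithm of non-positive numbers are all assumptions made for the example.

```python
import math


def log2_ratio(cy5_signal, cy5_background, cy3_signal, cy3_background, floor=1.0):
    """Background-subtract each channel and return log2(Cy5 / Cy3).

    Corrected intensities below `floor` are clamped to `floor` so that
    the ratio and its logarithm stay defined (an assumed convention).
    """
    cy5 = max(cy5_signal - cy5_background, floor)
    cy3 = max(cy3_signal - cy3_background, floor)
    return math.log2(cy5 / cy3)


# Hypothetical spot: Cy5 = 900 over background 100, Cy3 = 300 over background 100.
print(log2_ratio(900, 100, 300, 100))  # 800 / 200 = 4, so the log2 ratio is 2.0
```

A positive value indicates a stronger signal in the Cy5-labeled sample, a negative value a stronger signal in the Cy3-labeled sample.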

Experiments were performed to identify those transformants or knockouts that 
exhibited an improved environmental stress tolerance. For such studies, the 
transformants were exposed to a variety of environmental stresses. Plants were 
exposed to chilling stress (6 hour exposure to 4-8°C), heat stress (6 hour exposure to 
32-37°C), high salt stress (6 hour exposure to 200 mM NaCl), drought stress (168 
hours after removing water from trays), osmotic stress (6 hour exposure to 3 M 
mannitol), or nutrient limitation (nitrogen, phosphate, and potassium) (Nitrogen: all 
components of MS medium remained constant except N, which was reduced to 20 mg/l of 
NH4NO3; Phosphate: all components of MS medium remained constant except KH2PO4, which was 
replaced by K2SO4; Potassium: all components of MS medium remained constant except 
KNO3 and KH2PO4, which were replaced by NaH2PO4). 

Experiments were performed to identify those transformants or knockouts that 
exhibited modified structure and development characteristics. For such studies, the 
transformants were observed by eye to identify novel structural or developmental 
characteristics associated with the ectopic expression of the polynucleotides or 
polypeptides of the invention. 

Experiments were performed to identify those transformants or knockouts that 
exhibited modified sugar-sensing. For such studies, seeds from transformants were 
germinated on media containing 5% glucose or 9.4% sucrose which normally partially 
restrict hypocotyl elongation. Plants with altered sugar sensing may have either 
longer or shorter hypocotyls than normal plants when grown on these media. 
Additionally, other plant traits may be varied such as root mass. 

Flowering time was measured by the number of rosette leaves present when a 
visible inflorescence of approximately 3 cm is apparent. Rosette and total leaf number 
on the progeny stem are tightly correlated with the timing of flowering (Koornneef et 
al. (1991) Mol. Gen. Genet. 229:57-66). The vernalization response was measured. For 
vernalization treatments, seeds were sown to MS agar plates, sealed with micropore 
tape, and placed in a 4°C cold room with low light levels for 6-8 weeks. The plates 
were then transferred to the growth rooms alongside plates containing freshly sown 
non-vernalized controls. Rosette leaves were counted when a visible inflorescence of 
approximately 3 cm was apparent. 
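
Because flowering time is scored here as a rosette leaf count, a line can be compared with its controls by the difference in mean leaf number. The following is a minimal sketch of that comparison; the function name and the leaf counts are hypothetical and are not data from these experiments.

```python
from statistics import mean


def leaf_number_difference(transgenic_counts, control_counts):
    """Mean rosette leaf number of the transgenic line minus that of controls.

    A positive value means more leaves at bolting (later flowering);
    a negative value means fewer leaves (earlier flowering).
    """
    if not transgenic_counts or not control_counts:
        raise ValueError("both groups need at least one plant")
    return mean(transgenic_counts) - mean(control_counts)


# Hypothetical leaf counts scored when the inflorescence reached about 3 cm.
late_line = [14, 15, 16]
controls = [10, 11, 12]
print(leaf_number_difference(late_line, controls))
```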

Modified phenotypes observed for particular overexpressor or knockout plants 
are provided in Table 4. For a particular overexpressor that shows a less beneficial 
characteristic, it may be more useful to select a plant with a decreased expression of 
the particular transcription factor. For a particular knockout that shows a less 
beneficial characteristic, it may be more useful to select a plant with an increased 
expression of the particular transcription factor. 

The sequences of the Sequence Listing or those in Tables 4, 5 or those 
disclosed here can be used to prepare transgenic plants and plants with altered traits. 
The specific transgenic plants listed below are produced from the sequences of the 
Sequence Listing, as noted. Table 4 provides exemplary polynucleotide and 
polypeptide sequences of the invention. Table 4 includes, from left to right for each 
sequence: the first column shows the polynucleotide SEQ ID NO; the second column 
shows the Mendel Gene ID No., GID; the third column shows the trait(s) resulting 
from the knock out or overexpression of the polynucleotide in the transgenic plant; 
the fourth column shows the category of the trait; the fifth column shows the 
transcription factor family to which the polynucleotide belongs; the sixth column 
("Comment"), includes specific effects and utilities conferred by the polynucleotide 
of the first column; the seventh column shows the SEQ ID NO of the polypeptide 
encoded by the polynucleotide; and the eighth column shows the amino acid residue 
positions of the conserved domain in amino acid (AA) co-ordinates. 

Seed of plants overexpressing sequences G265 (SEQ ID NOs:871 and 872), 
G715 (SEQ ID NOs:925 and 926), G1471 (SEQ ID NOs:311 and 312), G1793 (SEQ 
ID NOs:365 and 366), G1838 (SEQ ID NOs:381 and 382), G1902 (SEQ ID NOs:405 
and 406), G286 (SEQ ID NOs:877 and 878), G2138 (SEQ ID NOs:865 and 866) and 
G2830 (SEQ ID NOs:875 and 876) was subjected to NIR analysis and a significant 
increase in seed oil content compared with seed from control plants was identified. 

G192: G192 (SEQ ID NO: 859) was expressed in all plant tissues and under 
all conditions examined. Its expression was slightly induced upon infection by 
Fusarium. G192 was analyzed using transgenic plants in which this gene was 
expressed under the control of the 35S promoter. G192 overexpressors were late 
flowering under 12 hour light and had more leaves than control plants. This 
phenotype was manifested in the three T2 lines analyzed. Results of one experiment 
suggest that G192 overexpressors were more susceptible to infection with a moderate 
dose of the fungal pathogen Erysiphe orontii. The decrease in seed oil observed for 
one line was replicated in an independent experiment. G192 overexpression delayed 
flowering. A wide variety of applications exist for systems that either lengthen or 
shorten the time to flowering, or for systems of inducible flowering time control. In 
particular, in species where the vegetative parts of the plants constitute the crop and 
the reproductive tissues are discarded, it will be advantageous to delay or prevent 
flowering. Extending vegetative development can bring about large increases in 
yields. G192 can be used to manipulate the defense response in order to generate 
pathogen-resistant plants. G192 can be used to manipulate seed oil content, which 
can be of nutritional value. 

Closely Related Genes from Other Species 

G192 had some similarity within the conserved WRKY domain to 
non-Arabidopsis plant proteins. 

G1946: G1946 (SEQ ID NO: 801) was studied using transgenic plants in 
which the gene was expressed under the control of the 35S promoter. 
Overexpression of G1946 resulted in accelerated flowering, with 35S::G1946 
transformants producing flower buds up to a week earlier than wild-type controls (24- 
hour light conditions). These effects were seen in 12/20 primary transformants and in 
two independent plantings of each of the three T2 lines. Unlike many early flowering 
Arabidopsis transgenic lines, which are dwarfed, 35S::G1946 transformants often 
reached full-size at maturity, and produced large quantities of seeds, although the 
plants were slightly pale in coloration and had slightly flat leaves compared to wild- 
type. In addition, 35S::G1946 plants showed an altered response to phosphate 
deprivation. Seedlings of G1946 overexpressor plants showed more secondary root 
growth on phosphate-free media, when compared to wild-type control. In a repeat 
experiment, all three lines showed the phenotype. Overexpression of G1946 in 
Arabidopsis also resulted in an increase in seed glucosinolate M39501 in T2 lines 
1 and 3. An increase in seed oil and a decrease in seed protein were also observed in 
these two lines. G1946 was ubiquitously expressed, and did not appear to be 
significantly induced or repressed by any of the biotic and abiotic stress conditions 
tested at this time, with the exception of cold, which repressed G1946 expression. 
G1946 can be used to modify flowering time, as well as to improve the plant's 
performance in conditions of limited phosphate, and to alter seed oil, protein, and 
glucosinolate composition. 

A comparison of the amino acid sequence of G1946 with sequences available 
from GenBank showed strong similarity with plant HSFs of several species 
(Lycopersicon peruvianum, Medicago truncatula, Lycopersicon esculentum, Glycine 
max, Solanum tuberosum, Oryza sativa and Hordeum vulgare subsp. vulgare). 

G375: The sequence of G375 (SEQ ID NO:239) was experimentally 
determined and G375 was analyzed using transgenic plants in which G375 was 
expressed under the control of the 35S promoter. Overexpression of G375 produced 
marked effects on leaf development. At early stages of growth, 35S::G375 seedlings 
developed narrow, upward pointing leaves with long petioles (possibly indicating a 
disruption in circadian-clock controlled processes or nyctinastic movements). 
Additionally, some seedlings were noted to have elongated hypocotyls, and some 
were rather small compared to wild-type controls. Comparable phenotypes were 
obtained by overexpression of an AP2 family gene, G2113 (SEQ ID NO: 85). 
Following the switch to flowering, 35S::G375 plants showed reduced fertility, which 
possibly arose from a failure of stamens to fully elongate. One of the three T2 lines, 
(#41) was later flowering than wild-type controls, and also developed large numbers 
of small secondary rosette leaves in the axils of the primary rosette. Although these 
effects were not noted in the other two lines, the phenotypes obtained in line 41 were 
somewhat similar to those produced by overexpression of another Z-dof gene, G736 
(SEQ ID NO: 211). G375 was expressed in all tissues, although at different levels. It 
was expressed at low levels in the root and germinating seed, and expressed at high 
levels in the embryo. The effects of G375 on leaf architecture are of potential interest 
to the ornamental horticulture industry. 

Closely Related Genes from Other Species 

G375 showed some homology to non-Arabidopsis plant proteins within the 
conserved Dof domain. 

G1255: The sequence of G1255 (SEQ ID NO: 273) was experimentally 
determined and G1255 was analyzed using transgenic plants in which G1255 was 
expressed under the control of the 35S promoter. Plants overexpressing G1255 had 
alterations in leaf architecture, a reduction in apical dominance, an increase in seed 
size, and showed more disease symptoms following inoculation with a low dose of the 
fungal pathogen Botrytis cinerea. G1255 was constitutively expressed and not 
significantly induced by any conditions tested. On the basis of the phenotypes 
produced by overexpression of G1255, G1255 can be used to manipulate the plant's 
defense response to produce pathogen resistance, alter plant architecture, or alter seed 
size. 

Closely Related Genes from Other Species 

G1255 showed strong homology to a putative rice zinc finger protein 
represented by sequence AC087181_3. Sequence identity between these two proteins 
extended beyond the conserved domain, and therefore, these genes can be orthologs. 

G865: The complete cDNA sequence of G865 (SEQ ID NO: 557) was 
determined. G865 was ubiquitously expressed in Arabidopsis tissues. G865 was 
analyzed using transgenic plants in which G865 was expressed under the control of 
the 35S promoter. Plants overexpressing G865 were early flowering, with numerous 
secondary inflorescence meristems giving them a bushy appearance. G865 
overexpressors were more susceptible to infection with a moderate dose of the fungal 
pathogens Erysiphe orontii and Botrytis cinerea. In addition, seeds from G865 
overexpressing plants showed a trend of increased protein and reduced oil content, 
although the observed changes were not beyond the criteria used for judging 
significance except in one line. G865 can be used to control flowering time. G865 
can be used to manipulate the defense response in order to generate pathogen-resistant 
plants. G865 can be used to alter seed oil and protein content of a plant. 

Closely Related Genes from Other Species 

G865 and other non-Arabidopsis AP2/EREBP proteins were similar within the 
conserved AP2 domain. 

G2509: G2509 (SEQ ID NO: 23) was studied using transgenic plants in 
which the gene was expressed under the control of the 35S promoter. Overexpression 
of G2509 caused multiple alterations in plant growth and development, most notably, 
altered branching patterns, and a reduction in apical dominance, giving the plants a 
shorter, more bushy stature than wild type. Twenty 35S::G2509 primary 
transformants were examined; at early stages of rosette development, these plants 
displayed a wild-type phenotype. However, at the switch to flowering, almost all T1 
lines showed a marked loss of apical dominance and large numbers of secondary 
shoots developed from axils of primary rosette leaves. In the most extreme cases, the 
shoots had very short internodes, giving the inflorescence a very bushy appearance. 
Such shoots were often very thin and flowers were relatively small and poorly fertile. 
At later stages, many plants appeared very small and had a low seed yield compared 
to wild type. In addition to the effects on branching, a substantial number of 
35S::G2509 primary transformants also flowered early and had buds visible several 
days prior to wild type. Similar effects on inflorescence development were noted in 
each of three T2 populations examined. The branching and plant architecture 
phenotypes observed in 35S::G2509 lines resemble phenotypes observed for three 
other AP2/EREBP genes: G865 (SEQ ID NO: 557), G1411 (SEQ ID NO: 3), and 
G1794 (SEQ ID NO: 13). G2509, G865, and G1411 form a small clade within the 
large AP2/EREBP family, and G1794, although not belonging to the clade, is one of 
the AP2/EREBP genes closest to it in the phylogenetic tree. It is thus likely that all 
these genes share a related function, such as affecting hormone balance. 
Overexpression of G2509 in Arabidopsis resulted in an increase in alpha-tocopherol 
in seeds in T2 lines 5 and 11. G2509 was ubiquitously expressed in Arabidopsis plant 
tissue. G2509 expression levels were altered by a variety of environmental or 
physiological conditions. G2509 can be used to manipulate plant architecture and 
development. G2509 can be used to alter tocopherol composition. Tocopherols have 
anti-oxidant and vitamin E activity. G2509 can be useful in altering flowering time. 
A wide variety of applications exist for systems that either lengthen or shorten the 
time to flowering. 

Closely Related Genes from Other Species 

G2509 showed some sequence similarity with known genes from other plant 
species within the conserved AP2/EREBP domain. 

G2347: G2347 (SEQ ID NO: 1119) was analyzed using transgenic plants in 
which G2347 was expressed under the control of the 35S promoter. Overexpression 
of G2347 markedly reduced the time to flowering in Arabidopsis. This phenotype 
was apparent in the majority of primary transformants and in all plants from two out 
of the three T2 lines examined. Under continuous light conditions, 35S::G2347 plants 
formed flower buds up to a week earlier than wild type. Many of the plants were rather 
small and spindly compared to controls. To demonstrate that overexpression of 
G2347 could induce flowering under less inductive photoperiods, two T2 lines were 
re-grown in 12 hour conditions; again, all plants from both lines bolted early, with 
some initiating flower buds up to two weeks sooner than wild-type. As determined by 
RT-PCR, G2347 was highly expressed in rosette leaves and flowers, and to much 
lower levels in embryos and siliques. No expression of G2347 was detected in the 
other tissues tested. G2347 expression was repressed by cold, and by auxin 
treatments and by infection by Erysiphe. G2347 is also highly similar to the 
Arabidopsis protein G2010 (SEQ ID NO: 1121). The level of homology between 
these two proteins suggested they could have similar, overlapping, or redundant 
functions in Arabidopsis. In support of this hypothesis, overexpression of both G2010 
and G2347 resulted in early flowering phenotypes in transgenic plants. 

Closely Related Genes from Other Species 

The closest relative to G2347 is the Antirrhinum protein, SBP2 (CAA63061). 
The similarity between these two proteins is extensive enough to suggest they might 
have similar functions in a plant. 

G988: G988 (SEQ ID NO: 43) was analyzed using transgenic plants in which 
G988 was expressed under the control of the 35S promoter. Plants overexpressing 
G988 had multiple morphological phenotypes. The transgenic plants were generally 
smaller than wild-type plants, had altered leaf, inflorescence and flower development, 
altered plant architecture, and altered vasculature. In one transgenic line 
overexpressing G988 (line 23), an increase in the seed glucosinolate M39489 was 
observed. The phenotype of plants overexpressing G988 was wild-type in all other 
assays performed. In wild-type plants, G988 was expressed primarily in flower and 
silique tissue, but was also present at detectable levels in all other tissues tested. 
Expression of G988 was induced in response to heat treatment, and repressed in 
response to infection with Erysiphe. Based on the observed morphological 
phenotypes of the transgenic plants, G988 can be used to create plants with larger 
flowers. This can have value in the ornamental horticulture industry. The reduction 
in the formation of lateral branches suggests that G988 can have utility in the forestry 
industry. The Arabidopsis plants overexpressing G988 also had reduced fertility. 
This can be a desirable trait in some instances, as it can be exploited to prevent or 
minimize the escape of GMO (genetically modified organism) pollen into the 
environment. 

Closely Related Genes from Other Species 

The amino acid sequence for the Capsella rubella hypothetical protein 
represented by GenBank accession number CRU303349 was significantly identical to 
G988 outside of the SCR conserved domains. The Capsella rubella hypothetical 
protein is 90% identical to G988 over a stretch of roughly 450 amino acids. 
Therefore, it is likely that the Capsella rubella gene is an ortholog of G988. 
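
A percent identity figure such as the one quoted above is obtained by counting matching residues across an alignment. The sketch below shows one common way of doing that count for two already-aligned sequences; it is an illustration under stated assumptions rather than the method used for this comparison. The function name and the short toy sequences are hypothetical, and gap positions (written as "-") are excluded from the denominator, which is only one of several conventions in use.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over aligned, ungapped positions.

    Both inputs must come from the same pairwise alignment and therefore
    have equal length. Columns with a gap in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared


# Toy 10-residue example with a single mismatch at the last position.
print(percent_identity("MKTAYIAKQR", "MKTAYIAKQK"))  # 9 of 10 match: 90.0
```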

G2346: G2346 (SEQ ID NO: 459) was analyzed using transgenic plants in 
which the gene was expressed under the control of the 35S promoter. 35S::G2346 
seedlings from all three T2 populations had slightly larger cotyledons and appeared 
somewhat more advanced than controls. This indicated that the seedlings developed 
more rapidly than the control plants. At later stages, however, G2346 overexpressing 
plants showed no consistent differences from control plants. The phenotype of these 
transgenic plants was wild-type in all other assays performed. According to RT-PCR 
analysis, G2346 is expressed ubiquitously. 

Closely Related Genes from Other Species 

G2346 shows some sequence similarity with known genes from other plant 
species within the conserved SBP domain. 

G1354: The complete sequence of G1354 (SEQ ID NO: 285) was determined. 
G1354 was analyzed using transgenic plants in which G1354 was expressed under the 
control of the 35S promoter. Overexpression of G1354 produced highly deleterious 
effects on growth and development. Only three 35S::G1354 T1 plants were obtained; 
all were extremely tiny and slow developing. After three weeks of growth, each of 
the plants comprised a completely disorganized mass of leaves and root that had no 
clear axis of growth. Since these individuals would not have survived transplantation 
to soil, they were harvested for RT-PCR analysis; all three plants showed moderate 
levels of G1354 overexpression compared to whole wild-type seedlings of an 
equivalent size. Only a very small number of transformants were obtained from two 
selection attempts on separate batches of T0 seed. Usually between 15 and 120 
transformants are obtained from each aliquot of 300 mg T0 seed from wild-type 
plants. The low transformation frequency obtained in this experiment suggests that 
high levels of G1354 overexpression might have completely lethal effects and prevent 
transformed seeds from germinating. As determined by RT-PCR, G1354 was 
uniformly expressed in all tissues and under all conditions tested in RT-PCR. 
However, the gene was repressed in leaf tissue in response to Erysiphe infection. 

Closely Related Genes from Other Species 

G1354 is closely related to a NAM protein encoded by polynucleotide from 
rice (AC005310). Similarity between G1354 and this rice protein extends beyond the 
signature motif of the family to a level that would suggest the genes are orthologs. 

G1063: G1063 (SEQ ID NO: 119) is a member of a clade of highly related 
HLH/MYC proteins that also includes G779 (SEQ ID NO: 113), G1499 (SEQ ID NO: 
7), G2143 (SEQ ID NO: 129), and G2557 (SEQ ID NO: 133). All of these genes 
caused similar pleiotropic phenotypic effects when overexpressed, the most striking 
of which was the production of ectopic carpelloid tissue. These genes can be 
considered key regulators of carpel development. A spectrum of developmental 
alterations was observed amongst 35S::G1063 primary transformants and the majority 
were markedly small, dark green, and had narrow curled leaves. The most severely 
affected individuals were completely sterile and formed highly abnormal 
inflorescences; shoots often terminated in pin-like structures, and flowers were 
replaced by filamentous carpelloid structures. In other cases, flowers showed 
internode elongation between floral whorls, with a central carpel protruding on a 
pedicel-like organ. Additionally, lateral branches sometimes failed to develop and 
tiny patches of carpelloid tissue formed at axillary nodes of the inflorescence. In lines 
with an intermediate phenotype, flowers contained defined whorls of organs, but 
sepals were converted to carpelloid structures or displayed patches of carpelloid 
tissue. In contrast, lines with a weak phenotype developed relatively normal flowers 
and produced a reasonable quantity of seed. Such plants were still distinctly smaller 
than wild-type controls. Since the strongest 35S::G1063 lines were sterile, three lines 
with a relatively weak phenotype, that had produced sufficient seed for biochemical 
and physiological analysis, were selected for further study. Two of the T2 
populations (T2-28,37) were clearly small, darker green and possessed narrow leaves 
compared to wild type. Plants from one of these populations (T2-28) also produced 
occasional branches with abnormal flowers like those seen in the T1. The final T2 
population (T2-30) displayed a very mild phenotype. Overexpression of G1063 in 
Arabidopsis resulted in a decrease in seed oil content in T2 lines 28 and 37. No 
altered phenotypes were detected in any of the physiological assays, except that the 
plants were noted to be somewhat small and produce anthocyanin when grown in 
Petri plates. G1063 was expressed at low to moderate levels in roots, flowers, rosette 
leaves, embryos, and germinating seeds, but was not detected in shoots or siliques. It 
was induced by auxin. G1063 can be used to manipulate flower form and structure or 
plant fertility. One application for manipulation of flower structure can be in the 
production of saffron, which is derived from the stigmas of Crocus sativus. G1063 
has utility in manipulating seed oil and protein content. 

Closely Related Genes from Other Species 

G1063 protein shared extensive homology in the basic helix loop helix region 
with a protein sequence encoded by Glycine max cDNA clone (AW832545) as well 
as a tomato root, plants pre-anthesis Lycopersicon esculentum cDNA (BE451174). 

G2143: G2143 (SEQ ID NO: 129) is a member of a clade of highly related 
HLH/MYC proteins that also includes G779 (SEQ ID NO: 113), G1063 (SEQ ID NO: 
119), G1499 (SEQ ID NO: 7), and G2557 (SEQ ID NO: 133). All of these genes 
caused similar pleiotropic phenotypic effects when overexpressed, the most striking 
of which was the production of ectopic carpelloid tissue. These genes can be 
considered key regulators of carpel development. Twelve out of twenty 35S::G2143 
T1 lines showed a very severe phenotype; these plants were markedly small and had 
narrow, curled, dark-green leaves. Such individuals were completely sterile and 
formed highly abnormal inflorescences; shoots often terminated in pin-like structures, 
and flowers were replaced by filamentous carpelloid structures, or a fused mass of 
carpelloid tissue. Furthermore, lateral branches usually failed to develop, and tiny 
patches of stigmatic tissue often formed at axillary nodes of the inflorescence. 
Strongly affected plants displayed the highest levels of transgene expression 
(determined by RT-PCR). The remaining T1 lines showed lower levels of G2143 
overexpression; these plants were still distinctly smaller than wild type, but had 
relatively normal inflorescences and produced seed. Since the strongest 35S::G2143 
lines were sterile, three lines with a relatively weak phenotype, that had produced 
sufficient seed for biochemical analysis, were selected for further study. T2-11 plants 
displayed a very mild phenotype and had somewhat small, narrow, dark green leaves. 
The other two T2 populations, however, appeared wild-type, suggesting that 
transgene activity might have been reduced between the generations. Reduced 
seedling vigor was noted in the physiological assays. G2143 expression was detected 
at low levels in flowers and siliques, and at higher levels in germinating seed. G2143 
can be used to manipulate flower form and structure or plant fertility. One application 
for manipulation of flower structure can be in the production of saffron, which is 
derived from the stigmas of Crocus sativus. 

Closely Related Genes from Other Species 

G2143 protein shared extensive homology in the basic helix loop helix region 
with a protein encoded by Glycine max cDNA clones (AW832545, BG726819 and 
BG154493) and a Lycopersicon esculentum cDNA clone (BE451174). There was 
lower homology outside of the region. 

G2557: G2557 (SEQ ID NO: 133) is a member of a clade of highly related 
HLH/MYC proteins that also includes G779 (SEQ ID NO: 113), G1063 (SEQ ID NO: 
119), G1499 (SEQ ID NO: 7), and G2143 (SEQ ID NO: 129). All of these genes 
caused similar pleiotropic phenotypic effects when overexpressed, the most striking 
of which was the production of ectopic carpelloid tissue. These genes can be 
considered key regulators of carpel development. The flowers of 35S::G2557 primary 
transformants displayed patches of stigmatic papillae on the sepals, and often had 
rather narrow petals and poorly developed stamens. Additionally, carpels were also 
occasionally held outside of the flower at the end of an elongated pedicel-like 
structure. As a result of such defects, 35S::G2557 plants often showed very poor 
fertility and formed small wrinkled siliques. In addition to such floral abnormalities, 
the majority of primary transformants were also small and darker green in coloration 
than wild type. Approximately one third of the T1 plants were extremely tiny and 
completely sterile. Three T1 lines (#7,9,12), that had produced some seeds, and 
showed a relatively weak phenotype, were chosen for further study. All three of the 
T2 populations from these lines contained plants that were distinctly small, had 
abnormal flowers, and were poorly fertile compared to controls. Stigmatic tissue was 
not noted on the sepals of plants from these three T2 lines. Another line (#4) that had 
shown a moderately strong phenotype in the T1 was sown for only morphological 
analysis in the T2 generation. T2-4 plants were small, dark green, and produced 
abnormal flowers with ectopic stigmatic tissue on the sepals, as had been seen in the 
parental plant. G2557 expression was detected at low to moderate levels in all tissues 
tested except shoots. It was induced by cold, heat, and salt, and repressed by 
pathogen infection. G2557 can be used to manipulate flower form and structure or 
plant fertility. One application for manipulation of flower structure can be in the 
production of saffron, which is derived from the stigmas of Crocus sativus. 

Closely Related Genes from Other Species 

G2557 protein shows extensive sequence similarity in the region of basic helix 
loop helix with a protein encoded by Glycine max cDNA clone (BE347811). 

G2430: The complete sequence of G2430 (SEQ ID NO: 697) was 
determined. G2430 is a member of the response regulator class of GARP proteins 
(ARR genes), although one of the two conserved aspartate residues characteristic of 
response regulators is not present. The second aspartate, the putative phosphorylated 
site, is retained so G2430 can have response regulator function. G2430 is specifically 
expressed in embryo and silique tissue. In morphological analyses, plants 
overexpressing G2430 showed more rapid growth than control plants at early stages, 
and in two of three lines examined produced large, flat leaves. Early flowering was 
observed for some lines, but this effect was inconsistent between plantings. G2430 
can regulate plant growth. Overexpression of G2430 in Arabidopsis also resulted in 
seedlings that are slightly more tolerant to heat in a germination assay. Seedlings 
from G2430 overexpressing transgenic plants were slightly greener than the control 
seedlings under high temperature conditions. In a repeat experiment on individual 
lines, G2430 line 15 showed the strongest heat tolerant phenotype. G2430 can be 
useful to promote faster development and reproduction in plants. 

Closely Related Genes from Other Species 

G2430 had some similarity within the conserved GARP and response- 
regulator domains to non-Arabidopsis proteins. 

G1478: The sequence of G1478 (SEQ ID NO: 831) was determined and 
G1478 was analyzed using transgenic plants in which G1478 was expressed under the 
control of the 35S promoter. Plants overexpressing G1478 had a general delay in 
progression through the life cycle, in particular a delay in flowering time. G1478 is 
expressed at higher levels in flowers, rosettes and embryos but otherwise expression 
is constitutive. Based on the phenotypes produced through G1478 overexpression, 
G1478 can be used to manipulate the rate at which plants grow, and flowering time. 

Closely Related Genes from Other Species 

G1478 shows some homology to non-Arabidopsis proteins within the 
conserved domain. 

G681: G681 (SEQ ID NO: 579) was analyzed using transgenic plants in 
which the gene was expressed under the control of the 35S promoter. Approximately 
half of the 35S::G681 primary transformants were markedly small and formed narrow 
leaves compared to controls. These plants often produced thin inflorescence stems, 
had rather poorly formed flowers with low pollen production, and set few seeds. 
Three T1 lines with relatively weak phenotypes, which had produced reasonable 
quantities of seed, were selected for further study. Plants from one of the T2 
populations were noted to be slightly small, but otherwise the T2 lines displayed no 
consistent differences in morphology from controls. In leaves of two of the T2 lines, 
overexpression of G681 resulted in an increase in the percentage of the glucosinolate 
M39480. According to RT-PCR analysis, G681 expression was detected at very low 
levels in flower and rosette leaf tissues. G681 was induced by drought stress. G681 
can be used to alter glucosinolate composition in plants. Increases or decreases in 
specific glucosinolates or total glucosinolate content are desirable depending upon the 
particular application. For example: (1) Glucosinolates are undesirable components 
of the oilseeds used in animal feed, since they produce toxic effects. Low- 
glucosinolate varieties of canola have been developed to combat this problem. (2) 
Some glucosinolates have anti-cancer activity; thus, increasing the levels or 
composition of these compounds might be of interest from a nutraceutical standpoint. 
(3) Glucosinolates form part of a plant's natural defense against insects. Modification 
of glucosinolate composition or quantity could therefore afford increased protection 
from predators. Furthermore, in edible crops, tissue specific promoters can be used to 
ensure that these compounds accumulate specifically in tissues, such as the epidermis, 
which are not taken for consumption. 

Closely Related Genes from Other Species 

G681 shows some sequence similarity with known genes from other plant 
species within the conserved Myb domain. 

G878: G878 (SEQ ID NO: 611) was studied using transgenic plants in which 
the gene was expressed under the control of the 35S promoter. Analysis of primary 
transformants revealed that overexpression of G878 delays the onset of flowering in 
Arabidopsis. Of the 35S::G878 T1 plants, 11/20 flowered approximately one week 
later than wild type under continuous light conditions. These plants were also darker 
green, had shorter stems, and senesced later than controls. G878 was ubiquitously 
expressed. G878 can be used to modify flowering time and senescence, and a wide 
variety of applications exist for systems that either lengthen or shorten the time to 
flowering. 

Closely Related Genes from Other Species 

G878 was highly related to other WRKY proteins from a variety of plant 
species, such as the Nicotiana tabacum DNA-binding protein 2 (WRKY2) 
(AF096299), and a Cucumis sativus SPFl-like DNA-binding protein (L44134). 

G374: G374 (SEQ ID NO: 47) was expressed at low levels throughout the 
plant and was induced by salicylic acid. G374 was investigated using lines carrying a 
T-DNA insertion in this gene. The T-DNA insertion was approximately three 
quarters of the way into the protein coding sequence and should result in a null 
mutation. Homozygosity for a T-DNA insertion within G374 caused lethality at early 
stages of embryo development. In an initial screen for G374 knockouts, heterozygous 
plants were identified. Seed from those individuals was sown to soil and eleven 
plants were PCR-screened to identify homozygotes. No homozygotes were obtained; 
6 of the progeny were heterozygous whilst the other 5 were wild type. This raised the 
prospect that homozygosity for the G374 insertion was lethal. To examine this 
possibility further, heterozygous KO.G374 plants were re-grown. These individuals 
looked wild type, but their siliques were examined for seed abnormalities. When 
green siliques were dissected, around 25% of developing seeds were white or aborted. 
Embryos from these siliques were cleared using Hoyer's solution, and examined under 
the microscope. It was apparent that embryos from the white seeds had arrested at 
early (globular or heart) stages of development, whilst embryos from the normal seeds 
were fully developed. Such arrested or aborted seeds most likely represented 
homozygotes for the G374 insertion. To support this conclusion, seed was collected 
from heterozygous plants and sown to kanamycin plates (the T-DNA insertion carried 
the NPT marker gene). Of the seedlings that germinated, 160 were kanamycin 
resistant and 107 were kanamycin sensitive. These data more closely fitted a 2:1 
(chi-sq., 1 df, = 5.5, 0.05>P>0.01) than a 3:1 (chi-sq., 1 df, = 32, P<0.001) ratio. Such a 
segregation ratio suggested that a homozygous class of kanamycin resistant seedlings 
was absent from the progeny of KO.G374 plants. G374 can be a herbicide target. 
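
The chi-square values quoted above can be reproduced from the reported counts (160 resistant, 107 sensitive). The following is a minimal sketch offered for illustration only and is not part of the original analysis; the function names are hypothetical, and the P value uses the closed-form upper tail of the chi-square distribution with one degree of freedom.

```python
import math


def chi_square_gof(observed, ratio):
    """Pearson chi-square statistic for observed counts against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    expected = [total * r / ratio_sum for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_1df(chi_sq):
    """Upper-tail probability for a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_sq / 2.0))


# Counts reported in the text: kanamycin resistant, kanamycin sensitive.
observed = [160, 107]

chi_2_to_1 = chi_square_gof(observed, [2, 1])  # homozygous-lethal expectation
chi_3_to_1 = chi_square_gof(observed, [3, 1])  # ordinary Mendelian expectation

print(f"2:1 chi-sq = {chi_2_to_1:.2f}, P = {p_value_1df(chi_2_to_1):.4f}")
print(f"3:1 chi-sq = {chi_3_to_1:.2f}, P = {p_value_1df(chi_3_to_1):.2e}")
```

Under these assumptions the 2:1 statistic rounds to 5.5 and the 3:1 statistic to 32, in agreement with the values stated above.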

Closely Related Genes from Other Species 

Similar sequences to G374 are present in tomato and Medicago truncatula, and 
these sequences can be orthologs. 

Example VIII: Identification of Homologous Sequences 

Homologous sequences from Arabidopsis and plant species other than 
Arabidopsis were identified using database sequence search tools, such as the Basic 
Local Alignment Search Tool (BLAST) (Altschul et al. (1990) J. Mol. Biol. 215:403- 
410; and Altschul et al. (1997) Nucl. Acid Res. 25: 3389-3402). The tblastx sequence 
analysis programs were employed using the BLOSUM-62 scoring matrix (Henikoff, 
S. and Henikoff, J. G. (1992) Proc. Natl. Acad. Sci. USA 89: 10915-10919). 

Identified non-Arabidopsis sequences homologous to the Arabidopsis 
sequences are provided in Table 5. The percent sequence identity among these 
sequences can be as low as 47%, or even lower. The entire NCBI 
GenBank database was filtered for sequences from all plants except Arabidopsis 
thaliana by selecting all entries in the NCBI GenBank database associated with NCBI 
taxonomic ID 33090 (Viridiplantae; all plants) and excluding entries associated with 
taxonomic ID 3701 (Arabidopsis thaliana). These sequences are compared to 
sequences representing genes of SEQ ID NOs:2 - 2N, where N = 2-561, using the 
Washington University TBLASTX algorithm (version 2.0a19MP) at the default 
settings using gapped alignments with the filter "off". For each gene of SEQ ID 
NOs:2 - 2N, where N = 2-561, individual comparisons were ordered by probability 
score (P-value), where the score reflects the probability that a particular alignment 
occurred by chance. For example, a score of 3.6e-40 is 3.6 x 10^-40. In addition to 
P-values, comparisons were also scored by percentage identity. Percentage identity 
reflects the degree to which two segments of DNA or protein are identical over a 
particular length. Examples of sequences so identified are presented in Table 5. 
Homologous or orthologous sequences are readily identified and available in 
GenBank by Accession number (Table 5; Test sequence ID). The identified 
homologous polynucleotide and polypeptide sequences and homologues of the 
Arabidopsis polynucleotides and polypeptides may be orthologs of the Arabidopsis 
polynucleotides and polypeptides (TBD: to be determined). 
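
The ordering of comparisons by P-value and the percentage identity score described above can be sketched as follows. This is an illustrative sketch only: the `Hit` record, its field names, the query label, the accession strings, and all numeric values are hypothetical placeholders, not entries from Table 5 or output of the TBLASTX program.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    query: str        # hypothetical label for an Arabidopsis query sequence
    accession: str    # hypothetical placeholder, not a real GenBank accession
    p_value: float    # probability that the alignment occurred by chance
    identities: int   # identical positions within the aligned segment
    align_len: int    # length of the aligned segment

    @property
    def percent_identity(self) -> float:
        """Degree to which the two aligned segments are identical, in percent."""
        return 100.0 * self.identities / self.align_len


def rank_hits(hits):
    """Order hits by ascending P-value; ties are broken by higher identity."""
    return sorted(hits, key=lambda h: (h.p_value, -h.percent_identity))


# Made-up example values; "3.6e-40" is parsed as 3.6 x 10^-40.
hits = [
    Hit("query_A", "ACC_0001", float("3.6e-40"), 94, 200),
    Hit("query_A", "ACC_0002", 1.0e-12, 60, 110),
    Hit("query_A", "ACC_0003", 2.1e-75, 150, 180),
]

ranked = rank_hits(hits)
for h in ranked:
    print(f"{h.accession}\tP={h.p_value:.1e}\tidentity={h.percent_identity:.0f}%")
```

In this sketch a smaller P-value ranks first, so a hit with a modest percentage identity can still rank above a hit with a higher identity over a shorter segment.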

Example IX: Introduction of Polynucleotides into Dicotyledonous Plants 

SEQ ID NOs:1-(2N - 1), wherein N = 2-561, paralogous, orthologous, and 
homologous sequences recombined into pMEN20 or pMEN65 expression vectors are 
transformed into a plant for the purpose of modifying plant traits. The cloning vector 
may be introduced into a variety of cereal plants by means well-known in the art such 
as, for example, direct DNA transfer or Agrobacterium tumefaciens-mediated 
transformation. It is now routine to produce transgenic plants using most dicot plants 
(see Weissbach and Weissbach (1989) supra; Gelvin et al. (1990) supra; Herrera-
Estrella et al. (1983) supra; Bevan (1984) supra; and Klee (1985) supra). Methods 
for analysis of traits are routine in the art and examples are disclosed above. 
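
The numbering convention "SEQ ID NOs:1-(2N - 1)" for polynucleotides and "SEQ ID NOs:2 - 2N" for polypeptides can be illustrated with a short sketch. The pairing rule used here (each odd-numbered polynucleotide is followed by its encoded polypeptide at the next even number) is an assumption inferred from the claims, where every polynucleotide SEQ ID NO recited is one less than a recited polypeptide SEQ ID NO; the function names are hypothetical.

```python
def polynucleotide_seq_ids(n_max=561):
    """Odd SEQ ID NOs 1, 3, ..., 2*n_max - 1 (assumed to be polynucleotides)."""
    return [2 * n - 1 for n in range(1, n_max + 1)]


def polypeptide_for(nucleotide_id):
    """Even SEQ ID NO assumed to hold the polypeptide encoded by an odd SEQ ID NO."""
    if nucleotide_id < 1 or nucleotide_id % 2 == 0:
        raise ValueError("polynucleotide SEQ ID NOs are assumed to be odd and positive")
    return nucleotide_id + 1


# Polynucleotide SEQ ID NOs recited in part (b) of claim 1.
claimed_nucleotides = [859, 801, 239, 273, 557, 23, 1119, 43, 459,
                       285, 119, 129, 133, 697, 831, 579, 611, 47]

pairs = {nuc: polypeptide_for(nuc) for nuc in claimed_nucleotides}
print(pairs[859], pairs[47])
```

Under this assumption, the polypeptide numbers obtained match those recited in part (a) of claim 1.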

Example X: Transformation of Cereal Plants with an Expression Vector 

Cereal plants such as corn, wheat, rice, sorghum or barley, may also be 
transformed with the present polynucleotide sequences in pMEN20 or pMEN65 
expression vectors for the purpose of modifying plant traits. For example, pMEN020 
may be modified to replace the NptII coding region with the BAR gene of 
Streptomyces hygroscopicus that confers resistance to phosphinothricin. The KpnI 
and BglII sites of the Bar gene are removed by site-directed mutagenesis with silent 
codon changes. 



The cloning vector may be introduced into a variety of cereal plants by means 
well-known in the art such as, for example, direct DNA transfer or Agrobacterium 
tumefaciens-mediated transformation. It is now routine to produce transgenic plants 
of most cereal crops (Vasil, I., Plant Molec. Biol. 25: 925-937 (1994)) such as corn, 
wheat, rice, sorghum (Casas, A. et al., Proc. Natl. Acad. Sci. USA 90: 11212-11216 
(1993)) and barley (Wan, Y. and Lemaux, P., Plant Physiol. 104:37-48 (1994)). DNA 
transfer methods such as microprojectile bombardment can be used for corn (Fromm et al., 
Bio/Technology 8: 833-839 (1990); Gordon-Kamm et al., Plant Cell 2: 603-618 
(1990); Ishida, Y., Nature Biotechnology 14:745-750 (1996)), wheat (Vasil et al., 
Bio/Technology 10:667-674 (1992); Vasil et al., Bio/Technology 11:1553-1558 
(1993); Weeks et al., Plant Physiol. 102:1077-1084 (1993)), and rice (Christou, 
Bio/Technology 9:957-962 (1991); Hiei et al., Plant J. 6:271-282 (1994); Aldemita 
and Hodges, Planta 199:612-617 (1996); Hiei et al., Plant Mol Biol. 35:205-18 (1997)). For 
most cereal plants, embryogenic cells derived from immature scutellum tissues are the 
preferred cellular targets for transformation (Hiei et al., Plant Mol Biol. 35:205-18 
(1997); Vasil, Plant Molec. Biol. 25: 925-937 (1994)). 

Vectors according to the present invention may be transformed into corn 
embryogenic cells derived from immature scutellar tissue by using microprojectile 
bombardment, with the A188XB73 genotype as the preferred genotype (Fromm, et 
al., Bio/Technology 8: 833-839 (1990); Gordon-Kamm et al., Plant Cell 2: 603-618 
(1990)). After microprojectile bombardment the tissues are selected on 
phosphinothricin to identify the transgenic embryogenic cells (Gordon-Kamm et al., 
Plant Cell 2: 603-618 (1990)). Transgenic plants are regenerated by standard corn 
regeneration techniques (Fromm et al., Bio/Technology 8: 833-839 (1990); Gordon-
Kamm et al., Plant Cell 2: 603-618 (1990)). 

The plasmids prepared as described above can also be used to produce 
transgenic wheat and rice plants (Christou, Bio/Technology 9:957-962 (1991); Hiei et 
al., Plant J. 6:271-282 (1994); Aldemita and Hodges, Planta 199:612-617 (1996); 
Hiei et al., Plant Mol Biol. 35:205-18 (1997)) that coordinately express genes of 
interest by following standard transformation protocols known to those skilled in the 
art for rice and wheat (Vasil et al., Bio/Technology 10:667-674 (1992); Vasil et al., 
Bio/Technology 11:1553-1558 (1993); Weeks et al., Plant Physiol. 102:1077-1084 
(1993)), where the bar gene is used as the selectable marker. 

All references, publications, patent documents, web pages, and other 
documents cited or mentioned herein are hereby incorporated by reference in their 
entirety for all purposes. Although the invention has been described with reference to 
specific embodiments and examples, it should be understood that one of ordinary skill 
can make various modifications without departing from the spirit of the invention. 
The scope of the invention is not limited to the specific embodiments and examples 
provided. 

1. A transgenic plant comprising a recombinant polynucleotide having a 
nucleotide sequence selected from the group consisting of: 

(a) a nucleotide sequence encoding a polypeptide comprising a sequence selected 
from those of SEQ ID NOs: 860, 802, 240, 274, 558, 24, 1120, 44, 460, 286, 120, 
130, 134, 698, 832, 580, 612, and 48, or a complementary nucleotide sequence 
thereof; 

(b) a nucleotide sequence of SEQ ID NOs: 859, 801, 239, 273, 557, 23, 1119, 43, 459, 
285, 119, 129, 133, 697, 831, 579, 611, 47, or a complementary nucleotide sequence 
thereof; and 

(c) a nucleotide sequence that hybridizes under stringent conditions to a nucleotide 
sequence of one or more polynucleotides of: (a) or (b). 

2. The transgenic plant of claim 1 wherein the transgenic plant possesses an 
altered trait as compared to another plant, or the transgenic plant exhibits an altered 
phenotype as compared to another plant, or the transgenic plant expresses an altered 
level of one or more genes associated with a plant trait as compared to another plant, 
wherein the other plant does not comprise the recombinant polynucleotide. 

3. The transgenic plant of claim 1 wherein the plant possesses an altered trait as 
compared to another plant wherein the trait is an alteration in one or more physical 
characteristics selected from the group consisting of: the number of trichomes, fruit 
and seed size and number, yield of stems, leaves, inflorescences, or roots, stability of 
seeds during storage, susceptibility of the seed to shattering, root hair length and 
quantity, internode distances, or the quality of seed coat. 

4. The transgenic plant of claim 1 wherein the plant possesses an altered trait as 
compared to another plant wherein the trait is an alteration in a plant growth 
characteristic selected from the group consisting of: growth rate, germination rate of 
seeds, vigor of plants and seedlings, leaf and flower senescence, male sterility, 
apomixis, flowering time, flower abscission, rate of nitrogen uptake, osmotic 
sensitivity to soluble sugar concentrations, biomass or transpiration characteristics, 
apical dominance, branching patterns, number of organs, organ identity, and organ 
shape or size. 



5. The transgenic plant of claim 1 wherein the plant possesses an altered trait as 
compared to another plant wherein the trait is an alteration in one or more 
characteristics selected from the group consisting of protein or oil production, seed 
protein or oil production, insoluble sugar level, soluble sugar level, and starch 
composition. 

6. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:860. 

7. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:802. 

8. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:240. 

9. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:274. 

10. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:558. 

11. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:24. 

12. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:1120. 

13. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:44. 

14. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:460. 



15. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:286. 

16. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO: 120. 

17. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO: 130. 

18. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO: 134. 

19. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:698. 

20. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO: 832. 

21. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:580. 

22. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:612. 

23. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:48. 

24. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:859. 

25. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:801. 



26. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:239. 

27. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO: 273. 

28. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:557. 

29. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:23. 

30. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:1119. 

31. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:43. 

32. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:459. 

33. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:285. 

34. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:119. 

35. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO: 129. 

36. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:133. 



37. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:697. 

38. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:831. 

39. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:579. 

40. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:611. 

41. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:47. 

42. The transgenic plant of claim 1, further comprising a constitutive, inducible, 
or tissue-specific promoter operably linked to said nucleotide sequence. 

43. The transgenic plant of claim 1, wherein the plant is selected from the group 
consisting of: soybean, wheat, corn, potato, cotton, rice, oilseed rape, sunflower, 
alfalfa, sugarcane, turf, banana, blackberry, blueberry, strawberry, raspberry, 
cantaloupe, carrot, cauliflower, coffee, cucumber, eggplant, grapes, honeydew, 
lettuce, mango, melon, onion, papaya, peas, peppers, pineapple, pumpkin, spinach, 
squash, sweet corn, tobacco, tomato, watermelon, mint and other labiates, rosaceous 
fruits, and vegetable brassicas. 

44. The transgenic plant of claim 1 wherein the encoded polypeptide is expressed 
and regulates transcription of a gene. 

45. A method of using the transgenic plant of claim 1 to grow a progeny plant 
from a parent plant, the method comprising crossing the transgenic plant with another 
plant, selecting seed, and growing the progeny plant from the seed. 

46. An isolated or recombinant polynucleotide comprising a nucleotide sequence 
selected from the group consisting of: 

(a) a nucleotide sequence encoding a polypeptide comprising a sequence selected 
from SEQ ID NOs: 240, 274, 558, 286, 698, and 832, or a complementary nucleotide 
sequence thereof; 

(b) a nucleotide sequence of SEQ ID NOs:239, 273, 557, 285, 697, 831, or a 
complementary nucleotide sequence thereof; and 

(c) a nucleotide sequence that hybridizes under stringent conditions to a nucleotide 
sequence of one or more of: (a) or (b). 

47. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence encoding a polypeptide 
of SEQ ID NO:240. 

48. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence encoding a polypeptide 
of SEQ ID NO:274. 

49. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence encoding a polypeptide 
of SEQ ID NO:558. 

50. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence encoding a polypeptide 
of SEQ ID NO:286. 

51. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence encoding a polypeptide 
of SEQ ID NO:698. 

52. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence encoding a polypeptide 
of SEQ ID NO:832. 

53. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence of SEQ ID NO:239. 

54. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence of SEQ ID NO:273. 

55. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence of SEQ ID NO:557. 

56. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence of SEQ ID NO:285. 

57. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence of SEQ ID NO:697. 

58. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence of SEQ ID NO:831. 

59. The isolated or recombinant polynucleotide of claim 46, further comprising a 
constitutive, inducible, or tissue-specific promoter operably linked to the nucleotide 
sequence. 

60. The isolated or recombinant polynucleotide of claim 46 wherein the encoded 
polypeptide is expressed and regulates transcription of a gene. 

61. A vector comprising the isolated or recombinant polynucleotide of claim 46. 

62. A host cell comprising the vector of claim 61. 

63. A method of using the isolated or recombinant polynucleotide of claim 46 for 
producing a plant having a modified trait, the method comprising selecting a 
polynucleotide that encodes a polypeptide, inserting the polynucleotide into an 
expression vector, introducing the vector into a plant or a cell of a plant to 
overexpress the polypeptide, thereby producing a modified plant, and selecting a 
modified plant for a modified trait. 

64. The method of claim 63 wherein the plant possesses a modified trait as 
compared to another plant wherein the trait is an alteration in one or more physical 
characteristics selected from the group consisting of: the number of trichomes, fruit 
and seed size and number, yield of stems, leaves, inflorescences, or roots, stability of 
seeds during storage, susceptibility of the seed to shattering, root hair length and 
quantity, internode distances, or the quality of seed coat. 

65. The method of claim 63 wherein the plant possesses a modified trait as compared 
to another plant wherein the trait is an alteration in a plant growth characteristic 
selected from the group consisting of: growth rate, germination rate of seeds, vigor of 
plants and seedlings, leaf and flower senescence, male sterility, apomixis, flowering 
time, flower abscission, rate of nitrogen uptake, osmotic sensitivity to soluble sugar 
concentrations, biomass or transpiration characteristics, apical dominance, branching 
patterns, number of organs, organ identity, and organ shape or size. 

66. The method of claim 63 wherein the plant possesses a modified trait as 
compared to another plant wherein the trait is an alteration in one or more 
characteristics selected from the group consisting of protein or oil production, seed 
protein or oil production, insoluble sugar level, soluble sugar level, and starch 
composition. 

67. A modified plant produced by the method of claim 63. 

68. A method of using the plant of claim 67 to grow a progeny plant from a parent 
plant, the method comprising crossing the transgenic plant with another plant, 
selecting seed, and growing the progeny plant from the seed. 

69. The plant produced by the method of claim 68. 

SEQUENCE LISTING 



<110> Mendel Biotechnology, Inc. 
Ratcliffe, Oliver 
Riechmann, Jose Luis 
Adam, Luc J. 
Dubell, Arnold T. 
Heard, Jacqueline E. 
Pilgrim, Marsha L. 
Jiang, Cai-Zhong 
Reuber, T. Lynne 
Creelman, Robert A. 
Pineda, Omaira 
Yu, Guo-Liang 
Broun, Pierre E. 

<120> YIELD-RELATED POLYNUCLEOTIDES AND 
POLYPEPTIDES IN PLANTS 

<130> 514442002041 

<150> 60/310,847 
<151> 2001-08-09 

<150> 60/336,049 
<151> 2001-11-19 

<150> 60/338,692 
<151> 2001-12-11 

<150> 10/171,468 
<151> 2002-06-14 

>G1275 (58.. 579) 

CCAAGAAAAGGGAAGATCACGCATTCTTATAGGCGTAATTCGTAAATAGTGGTGAGTATG 
AATGATGCAGACACAAACTTGGGGAGTAGTTTCAGCGATGATACTCACTCTGTGTTCGAG 
TTTCCGGAGCTAGACTTGTCAGATGAATGGATGGATGATGATCITGTGTCTGCGGTTTCC 
GGGATGAATCAGTCTTATGGTTATCAGACTAGTGATGTTGCTGGTGCTTTATTCTCAGGT 
TCTTCTAGCTGTTTC^GTCATCCTGAATCTCCAAGTACCAAAACTTATGTTGCTGCTACA 
GCCACTGCTTCTGCCGACAACC^VAAACAAGAAAGAAAAGAAAAAAATTAAAGGGAGAGTT 
GCGTTCAAGACACGGTCCGAGGTGGAAGTGCTTGACGACGGGTTCAAGTGGAGAAAGTAT 
GGGAAGAAGATGGTGAAGAA(^GCC CACATC CAAGAAACTACTACAAATGTTCAGTTGAT 
GGCTGTCCCGTGAAGAAAAGGGTTGAACGAGACAGAGATGATCCGAGCTTTGTGATAACA 
ACTTACGAGGGTTCCCACAATCACTCAAGC^TGAACTAAGACrCGA 

CGACCATGCTATATTCAGCACATCTTATTTTCTATGGTTACGAACGATACTTAAAACTGC 
TTCTAGTTCTTTATATCCATTGTAAACTGGTTGCAGGTTCACAAATTTTGAGAGGTTTAT 
GACATTCTAAATCTGTAGTACTTATATA 

>G1275 Amino Acid Sequence (domain in AA coordinates: 113-169) 
MNDADTNLGSSFSDDTHSVFEFPELDLSDEWMDDDLVSAVSGMNQSYGYQTSDVAGALFS 
GSSSCFSHPESPSTKTYVAATATASADNQNKKEKICKIKGRV^ 
YGKKMVKNSPHPRNYYKCSVDGCPVKKRVERDRDDPSFVITTYEGSHNHSSMN* 
>G1411 (110.. 856) 

TAAAGAAAAACTGAACAACCCTAAAGTACTGTATAAATCCTATATCAAATTTTTTTTTTG 
GAAGAAAAGGCTATATTTAAAAGT^AAATCAAGCAAAAGTAGATCCTCGGATGTATGGGAA 
GAGGCCTTTTGGAGGTGATGAATCTGAAGAAAGGGAAGAAGATGAGAACTTGTTCCCGGT 
CTTCTCGGCCCGATCTCAACACGACATGCGTGTTATGGTCTCGGCCTTGACTCAAGTAAT 
CGGAAACCAACAAAGCAAATCTCATGATAACATCAGCTCTATTGATGATAACTATCCTTC 
TGTGTATAATCCACAAGACCCTAATC^VAC^^GTT^ 

CTTGAGGAGGAGACATTATAGAGGTGTAAGGCAAAGGCCATGGGGAAAGTGGGCAGCTGA 

AATCCGAGACCCAAAAAAGGCGGCACGTGTGTGGCTCGGGACATTTGAAACCGCTGAATC 

TGCGGCCTTAGCTTATGATGAAGCAGCCCTAAAGTTCAAAGGAAGCAAAGCAAAACTCAA 

TTTCCCGGAGAGGGTTCAGCTTGGAAGTAACTCTACATATTACTCCTCCAACCAAATTCC 

ACAAATGGAACCACAAAGTATACCGAACTATAATCAATACTATCATGATGCGAGTAGTGG 

TGATATGCTAAGTTTTAATTTGGGCGGTGGGTATGGGAGTGGTACCGGATATTCAATGTC 

TCATGATAATAGTACTACGACTGCTGCTACAACTTCTTCGTCTTCTGGTGGCTCTTCTAG 

GCAACAAGAAGAGCAAGATTATGCCAGATTCTGGCGCTTTGGGGATTCTTCT^ 

TCATTCGGGATATTAATTAGGAGATTTGATCAGTTACTTGTGATGAAGTAATGATACATT 

TCCCGTCAAAATTGAGATGATC^TATGCTTCCTGAATGTTTTTGAGTGTCATTT 

TCCGCGTTAAGATTTATTGAACGTGTTTTCTTGTTTTTTTGGTTAAAAAA 

AAAAAAAAAAAAAA 

>G1411 Amino Acid Sequence (domain in AA coordinates: 87-154) 

MYGKRPFGGDBSEEREEDENIiFPVFSARSQHDMRVMVSAIjTQVIGNQQSKSHDNISSIDD 

NYPSVYNPQDPNQQVAPTHQDQGDLRRRHYRGTOQRPWGKWA^ 

TAESAALAYDEAAIjKFKGS KAKLNFPERVQLGSNSTYYS SNQI PQMEPQS I PNYNQ YYHD 

ASSGDMLSFNLGGGYGSGTGYSMSHDNSTTTAATTSSSSGGSSRQQEEQDYARFWRFGDS 

SSSPHSGY* 

>G1488 (1..996) 

ATGGAAGATGAAGCACATGAATTCTTCGAGACATCT^ 

GTTGATTTCTCTAACGATGATGACGAAGAAAACGATGTTGTTGCTGATTCCACCACTACC 
ACCACCATAACCGACAGCTCTAACTTCTCCGCTGCTGATCTTCCCAGTTTCCACGGTGAT 
GTTCAAGACGGCACTAGCTTCTCCGGTGACCTTTGTATACCTTCTGATGATTTGGCTGAT 
GAGTTAGAGTGGCTTTCGAACATTGTGGATGAATCATTGTCGCCTGAAGATGTACACAAG 
CTCGAGCTAATATCCGGTTTTAAGAGTCGACCGGACCCGAAATCCGAtACCGGAAGCCCG 
GAAAACCCGAATAGCAGCAGTCCGATTTTTACTACCGACGTTTCTGTACCGGCCAAAGCT 
AGAAGC^y^CGCTCACGCGCCGCTGCGTGTAATTGGGCCTCACGTGGGCTTCTC^AGGAA 
ACGTTTTACGACAGTCCTTTCACCGGAGAAACCAT^ 

CCGCCAACCTCGCCGCCTTTGTTGATGGCTCCGCTAGGGAAAAAGCAAGCCGTTGATGGA 
GGACACCGACGGAAGAAGGATGTTTCTTCACCGGAGTCTGGTGGCGCAGAGGAGAGACGG 
TGTCTCCACTGCGCCACGGATAAGACTCCGCAATGGCGGACAGGCCCAATGGGCCCGAAG 
ACGTTGTGGAACGCTTGCGGTGTTAGGTACAAATCGGGACGTTTAGTGCCGGAGTATCGG 
CCCGCGGCGAGTCCGACGTTTGTGCTGGCGAAACACTCAAATTCTCATCGGAAAGTTATG 
GAGCTCCGGCGACAGAAGGAGATGAGTAGGGCCCATCATGAGTTCATACATCACCATCAC 
GGTACGGACACTGCCATGATTTTCGACGTTTCATCGGACGGTGATGATTACTTGATCCAC 
CACAACGTTGGCCCAGATTTCAGACAGCTTATTTGA 

>G1488 Amino Acid Sequence (domain in AA coordinates: 221-246) 
MEDEAHEFFHTSDFAVDDLLVDFSNDDDEENDWADSTTTTTITDSSNFSAADLPSFHGD 
VQDGTSFSGDLCIPSDDLADELEWLSNIVDESLSPEDVHKXiELISGFKSRPDPKSDTGSP 
ENPNSSSPIFTTDVSVPAKARSKRSRAAAC^ASRGLLKETFYDSPFTGETILSSQQHLS 
PPTSPPLLMAPIjGKKQAVDGGHRRKKDVSSPESGGAEERRCLHCATDKTPQWRTGPMGPK 
TIjCNACGVRYKSGRIiVPE YRPAAS PTFVIxAKHSNSHRKVMELRRQKEMSRAHHEF IHHHH 
GTDTAMI FDVS SDGDD YL IHHNVGPDFRQL I * 
>G1499 (159.. 833) 

TCGACTCCTTAATTGCATCACCAACCTAAC^ 
CCTTTAATATATATATATGCTCACACAC^ 

AAGCATTAAAATTTTTACGAACCAAACAAACAAAAATTATGAATAATTATAATATGAACC 
CATCTCTCTTCCAAAATTACACTTGGAAGAACAT 

AGAATGATGATCATCATCATCAACATAATAATGATCCAATCGGTATGGCCATGGACCAGT 
ACACACAGCTCCATATCTTCAATCCTTTCTCTTC^^ 

CCCTCACAACCACCACTCTTCTCTCCGGAGATCAAGAAGACGACGAAGACGAAGAAGAAC 
CTCTAGAGGAACTCGGTGCTATGAAGGAAATGATGTACAAGATCGCAGCCATGCAATCGG 
TTGACATCGACCCAGCAACCGTCAAGAAACCCAAACGCCGTAACGTGAGGATCTCCGACG 
ACCCTCAGAGTGTGGCGGCTAGACATCGCCGTGAGAGAATCAGTGAGAGGATCAGAATTC 
TTCAGAGACTCGTGCCAGGTGGCACTAAAATGGATACGGCTTCAATGCTCGATGAAGCTA 
TACGCTATGTCAAGTTCTTGAAACGGCAGATCCGGCTACTCAATAATAATACCGGATATA 
CTCCTCCGCCGCCGCAAGATCAAGCTTCTCAGGCGGTGACGACGTCATGGGTTTCACCGC 
C^CCACCGCCAAGTTTCGGCCGTGGGGGAAGAGGAGTAGGAGAATTAATCTAGACAAGAT 
GACATTTCCATTAGTAGTAACTAAATTATGCTATAATGTGTGAGTAATGGTGCAATTATG 
GA 

>G1499 Amino Acid Sequence (domain in AA coordinates: 118-181) 

mktotyn^psijFqny^ 

hfpplsssiittttllsgdqeddedeeepleelgamkemmykiaamqsvdidpatvkkpkr 

RNVRISDDPQSVAARHRRERISERIRILQRLVPGGTKMDTASN^ 
IiNNNTGYTPPPPQDQASQAVTTSWVS PPPPPSFGRGGRGVGELI * 
>G1543 (1..828) 

ATGATAAAACTACTATTTACGTACATATGCACATACACATATAAACTATATGCTCTATAT 

CATATGGATTACGCATGCGTGTGTATGTATAAATATAAAGGCATCGTCACGCTTCAAGTT 

TGTCTCTTTTATATTAAACTGAGAGTTTTCCTCTCAAACTTTACCTTTTCTTCTTCGATC 

CTAGCTCTTAAGAACCCTAATAATTCATTGATCAAAATAATGGCGATTTTGCCGGAAAAC 

TCTTCAAACTTGGATCTTACTATCTCCGTTCCAGGCTTCTCTTCATCCCCTCTCTCCGAT 

GAAGGAAGTGGCGGAGGAAGAGACCAGCTAAGGCTAGACATGAATCGGTTACCGTCGTCT 

GAAGACGGAGACGATGAAGAATTCAGTCACGATGATGGCTCTGCTCCTCCGCGAAAGAAA 

CTCCGTCTAACCAGAGAACAGTCACGTCTTCTTGAAGATAGTTTCAGACAGAATCATA^ 

CTTAATCCCAAACAAAAGGAAGTACTTGCCAAGCATTTGATGCTACGGCCAAGACAAATT

GAAGTTTGGTTTCAAAACCGTAGAGGAAGGAGCAAATTGAAGCAAACCGAGATGGAATGC

GAGTATCTCAAAAGGTGGTTTGGTTCATTAACGGAAGAAAACCAQ 

GTAGAAGAGCTTAGAGCCATAAAGGTTGGCCCAACAACGGTGAACTCTGCCTCGAGCCTT 

ACTATGTGTCCTCGCTGCGAGCGAGTTACCCCTGCCGCGAGCCCTTCGAGGGCGGTGGTG 

CCGGTTCCGGCTAAGAAAACGTTTCCGCCGCAAGAGCGTGATCGTTGA 

>G1543 Amino Acid Sequence (domain in AA coordinates: 135-195) 

MIKILIiFTYI CTYTYKLYALYHKDYACV CMYKYKG IVTLQVCLF YI KLRVFLSNFTFS SSI 

IaALKNPNNSIiIKIMAILPENSSNLDLTISVP 

EDGDDEEFSHDDGSAPPRKKLRLTREQSRLLEDSFRQNHTLNPKQKEVLAKHLMLRPRQI
EVWFQNRRARSKLKQTEMECEYLKRWFGSLTEENHRLHREVEELRAIKVGPTTVNSASSL
TMCPRCERVTPAASPSRAVVPVPAKKTFPPQERDR*
>G1635 (1..1164) 

ATGGCGTCGTCTCCGTTGACTGCAAATGTTCAGGGTACCAACGCTTCTTTGAGGAATAGA 

GATGAAGAAACTGCAGACAAGCAGATACAATTCAATGACCAAAGTTTTGGGGGAAA 

TATGCACCCAAGGTACGGAAGCCATACACGATAACAAAAGAGAGAGAGAGATGGACAGAT 

GAAGAGCACAAGAAGTTTGTTGAAGCCTTGAAATTATACGGGCGAGCTTGGAGACGAATA 

GAAGAACATGTGGGCTCAAAGACCGCAGTTCAGATTCGAAGC(^ 

TCTAAGGTTGCTCGAGAAGCAACTGGAGGTGATGGGAGCTCAGTAGAGCCGATTGTAATA 

CCTCCTCCTCGTCCCAAGAGAAAGCCAGCGCATCCGTACCCTCGTAAGTTTGGGAACGAG 

GCAGATCAAACAAGTAGATCGGTTTCTCCCTCAGAATC 

GTGTTGTCCACTGTTGGATGAGAAGCATTGTGTTCCC 

AGCTTGTCCCCAGTTTCTTCTGCATCACCACCAGCT 

CCTGAAGAGCTTGAGACTCTGAAGCTGGAGTTGTTTCCTAGTGAGAGACTCTTAAACAGG 
GAGAGCTCGATCAAGGAACCAACGAAGCAAAGTCTTAAACTCTTTGGGAAGACAGTTTTG 
GTATCTGATTCAGGCATGTCCTCTTCTCTAACAACTTCAACATATTGTAAATCCCCAATT 
CAGCCATTACCACGGAAACTCTCATCATCCAAGACACTACCCATAATAAGAAACTCACAA 
GAAGAACTCTTGAGCTGCTGGATACAAGTCCCTCTTAAGCAAGAAGATGTGGAAAATAGA 
TGTTTGGATTCAGGAAAGGCTGTCCAAAACGAAGGATCATCGACTGGATCAAACACTGGT 
TCGGTGGATGATACGGGACACACGGAAAAGACCACAGAACCCGAAACAATGCTATGTCAA 
TGGGAGTTTAAACCAAGTGAGAGGTCTGCATTTTCTGAGCTCAGAAGAACAAACTCCGAG 
TCAAATTGAAGAGGATTTGGTCCATACAAGAAGAGAAAGATGGTAACAGAAGAAGAAGAG 
CATGAGATTCATCTCCACTTATAA 

>G1635 Amino Acid Sequence (domain in AA coordinates: 44-104) 
MASSPLTANVQGTNASLRttRDEETADKQIQFND 

EEHKKFVEALKLYGRAWRRIEEHVGSKTAVQIRSHAQKFFSKVAREATGGDGSSVEPIVI 
PPPRPKRKPAHPYPRKFGNEADQTSRSVSPSERDTQSPTSVLSTVGSEALCSLDSSSPNR
SLSPVSSASPPAALTTTANAPEELETLKLELFPSERLLNRESSIKEPTKQSLKLFGKTVL
VSDSGMSSSLTTSTYCKSPIQPLPRKLSSSKTLPIIRNSQEELLSCWIQVPLKQEDVENR
CLDSGKAVQNEGSSTGSNTGSVDDTGHTEKTTEPETMLCQWEFKPSERSAFSELRRTNSE
SNSRGFGPYKKRKMVTEEEEHEIHLHL*
>G1794 (160..1335)

TCTTTCT1TCTTCCTCTTTGTCTCTGTTTCTTGTTTCTCTCTCTCTCTCTCTACAGAGTT 
TTCTTTCCCTCGAAGAAAAAGAATATTTTTAAATTTAATTTTCTCTGCGTTTATAAGCTT 
TAAGTTTCAGAGGAGGAGGATTTAGAAGGAGGGTTTTGTATGTGTGTCTTAAAAGTGGCA 
AATCAGGAAGATAACGTTGGCAAAAAAGCCGAGTCTATTAGAGACGATGATCATCGGACG 
TTATCTGAAATCGATCAATGGCITTACTTATTCGCAGCCGAAGACGACCACCACCGT 
AGCTTCCCTACGCAGCAGCCGCCTCCATCGTCGTCGTCCTCATCTCTTATCTCAGGTTTC 
AGTAGAGAGATGGAGATGTCTGCTATTGTCTCTGCTTTGACTCACGTTGTTGCTGGAAAT 
GTTCCTCAGCATCAACAAGGAGGCGGTGAAGGTAGCGGAGAAGGGACTTCGAATTCGTCT
TCTTCCTCGGGGCAGAAAAGGAGGAGAGAGGTGGAGGAAGGTGGCGCCAAAGCGGTTAAG 
GCAGCTAATACTTTGACGGTTGATCAATATTTCTCCGGTGGTAGCTCTACTTCTAAAGTC 
AGAGAAGCTTCGAGTAACATGTCAGGTCCGGGCCCAACATACGAGTATACAACTACGGCA 
ACTGCTAGTAGCGAAACGTCGTCGTTTAGTGGGGACCAACCTCGGCGAAGATACAGAGGA 
GTTAGACAAAGACCATGGGGAAAGTGGGCGGCTGAGATTCGAGATCCATTTAAAGCAGCT 
AGAGTTTGGCTCGGTACGTTCGACAATGCTGAATCAGCAGCAAGAGCTTACGACGAAGCT 
GCACTTCGGTTTAGAGGCAACAAAGCCAAACTCAACTTCCCTGAAAACGTCAAACTCGTT 
AGACCTGCTTC^UVCCGAAGC^CAACCTGTGC^CC^U^ACCGCTGCTCAAAGACCGACCCAG 
TCAAGGAACTCGGGTTCAACGACTACCCTTTTGCCCATAAGACCTGCTTCGAATCAAAGC 
GTTCATTCGCAGCCGTTGATGCAATCATACAACTTGAGTTACTCTGAAATGGCTCGTCAA
CAACAACAGTTTCAGCAACATCATCAACAATCTTTGGATTTATACGATCAAATGTCGTTT 
CCGTTGCGTTTCGGTCAC^CTGGAGGTTCAATGATGC^^TCTACGTCGTC^TCATCATCT 
CATTCTCGTCCTCTGTTTTCCCCGGCTGCTGTTCAGCCGCCACCAGAATCAGCTAGCGAA 
ACCGGTTATCTCCAGGATATACAATGGCCATCAGACAAGACTAGTAATAACTACAATAAT 
AGTCCATCCTCCTGATGACTTGCTTCATTTTATTTGTTTCACTATAGAGTAATAGAAAAC 
AGGAAAATGATTATATGTTATAGAGTTATTTTTCCAAATATTATAGGGTTTAGGTTGTTT 
GTATTGTTCTGCTTTCATCCTCTCATGCTTTTTTTCTTAATTTATTATATTTTTGCATTA 
' TAATTTCGTTTCATTGTAACAAACATTAAA^ 
AG 

>G1794 Amino Acid Sequence (domain in AA coordinates: TBD) 
MCVLKVANQEDNVGKKAESIIUDDDHRTLSEI^ 

SSLISGFSREMEMSAIVSALTHVVAGNVPQHQQGGGEGSGEGTSNSSSSSGQKRRREVEE 

GGAKAVKAAl^LTVDQYFSGGSSTSKVREASSNMSGPGPTYEYTTTATASSETSSFSGDQ 

PRRRYRGTOQRPWGKWAAEIRDPFKAARWLGTFDNAESAARAYDEAA^ 

PENVKLVRPASTEAQPVHQTAAQRPTQSRNSGSTTTLLPIRPASNQSVHSQPLMQSYWLS 

YSEMARQQQQFQQHHQQSLDLYDQMSFPLRFGHTGGSMMQSTSSSSSHSRPLFSPAAVQP

PPESASETGYLQDIQWPSDKTSNNYNNSPSS*

>G1839 (38..592)

ATCACAGTTATGTTTCCATTCATTGGCTATAAAAACCATGCTCACTCCCTTTTGTTCTTC 
ACACCATTTGCAGGAAAAAATGAATAGTTGTCAGTCTAATCCCACCAAAATGGATAATTC 
AGAAAATGTTCTATTTAATGATGAAAACGAAAATTTCACA^ 

TTCTTCGTACTTGACAAGAGATCAAGAGC^CGAGATC^TGGTOTCTGCTCTGCGAC^ 

GATATCTAACTCCGGAGCTGACGACGCX5TCATCATCAAACTTGATCATCACAAGCGTTCC 

GCCTCCAGACGCTGGCCCTTGTCCTCTCTGTGGCGTCGCCGGTTGCTACGGCTGCACATT 

ACAACGGCCGCACCGAGAGGTAAAGAAGGAGAAGAAATACAAAGGAGTAAGGAAAT^AACC 

ATCGGGTAAATGGGCGGCGGAGATATGGGATCCGAGATCGAAATCAAGGAGGTGGCTTGG

AACGTTTCTTACGGCGGAGATGGCGGCACAATCTTACAATGATGCGGCGGCTGAGTATCG 

AGCAAGACGTGGTAAAACAAACGGAGAAGGAATTAAACGGCGGTGGAGATGACTGAGAAG 

GACATGGTCGGTGATCATACACGGCGAGGTGGAAATGTTATATTTACTATTGAAAACTAA 

ATTATTTATTATAGAGGGAGATATTACTCTTTACGCTTTCATTAAGATTTATTTTTATAA 

GTTTTAAAGTATTTTATTGTTATAAAAAAAAAAAAAAAAAAAAAA 

>G1839 Amino Acid Sequence (domain in AA coordinates: TBD) 

MLTPFCSSHHLQEKMNSCQSNPTKMDNSENVLFNDQNENFTLVAPHPSSSYLTRDQEHEI

MVSALRQVTSNSGADDASSSNLIITSVPPPDAGPCPLCGVAGCYGCTLQRPHREVKKEKK

YKGVRKKPSGKWAAEIWDPRSKSRRWLGTFLTAEMAAQSYNDAAAEYRARRGKTNGEGIK
RRWR*

>G2108 (35..694)

GAGAGAGAAACATTGATCTCTGAATATTGTGAACATGTTGAAATCAAGTAACAAGAGAAA 

AAGCAAAGAAGAGAAGAAGTTACAAGAAGGGAAGTACCTTGGAGTGAGGAGACGTCCATG 

GGGAAGATATGCAGCTIXSAAATC^GAAACCCTTTTACTAAAGAAAGACATTGGCTTGGAAC 

GTTTGATACAGCCGAAGAAGCTGCTTTTGCATATGACGTTGCTGCTCGATCCATCAGCGG 

CTCTCTAGCTACAACAAACTTCTTCTACACTGAAAACACCTCTTTAGAAAGACATCCACA 

ACAGTCTTTGGAGCCTCATATGACTTGGGGATCTTCTAGTCTCTGTCTTCTTCAAGATCA 

GCCTTTTGLAAAACAACCATTTTGTTGCTGATCCTATCTCTTCTTCTTTTTCTCAAAAACA 

AGAGTCTTCTACCAATCTCACTAACACTTTCTCACATTGTTATAATGATGGTGAT 

TGGCCAAAGCAAAGAGATTTCTTTACCTAATGATATGTCAAACAGTTT^ 

GGAGAAAGTCGGTGAACATGACAATGCAGACCATATGAAGTTTGGCT 

CGAACCTCTCTGCTTTGAGTATGACTACATTGGGAATTATCTTCAGAGTTTTCTCAAAGA 

TGTCAACGACGATGCTCC^CAGTTTCriTATGTGAGCTTGTATTACCGA 

TG 

>G2108 Amino Acid Sequence (domain in AA coordinates: 18-85) 
MLKSSNKRKSKEEKKLQEGKYLGVRRRPWGRYAAEIRNPFTKERHWLGTFDTAEEAAFAY
DVAARSISGSIJVTTNFFYTENTSLERHPQQSLEPH^^ 

ISSSFSQKQESSTNLTNTFSHCYNDGDHVGQSKEISLPNDMSNSLFGHQDKVGEHDNADH

MKFGSVLSDEPLCFEYDYIGNYLQSFLKDVNDDAPQFLM*

>G2291 (27..797)

GCTTTCTCACCTTTATAAAATAGAAAATGGAAAACAGCTACACCGTTGATGGTCACCGTC 

TTCAATATTCCGTTCCGTTAAGCTCCATGCATGAAACCAGTCAAAACTCCGAAACTTACG 

GATTATCCAAAGAGTCGCCGTTGGTCTGCATGCCCTTGTTCGAAACCAACACTACTTCAT 

TCGATATCTCTTCTCTTTTCTCGTTTAACCCAAAACCAGAACCCGAAAACACGCATCGTG 

TCATGGACGATTCCATCGCCGCCGTCGTGGGCGAAAACGTTCTTTTCGGTGATAAAAACA 

AAGTCTCTGATCACTTGACCAAAGAAGGTGGTGTGAAGCGGGGGCGGAAGATGCCGCAGA 

AGACCGGAGGATTCATGGGAGTGAGAAAACGGCCGTGGGGGAGATGGTCGGCGGAGATAA 

GAGACAGGATAGGGCGGTGCAGACACTGGTTAGGAACGTTCGACACGGCGGAAGAGGCAG 

CGCGTGCGTATGACGCGGCGGCGAGGAGGCTTAGAGGGACCAAAGCCAAGACCAATTTCG 

TGATTCCTCCGCTTTTTCCCAAGGAAATAGCTCAGGCTCAGGAGGATAATAGGATGAGGC 

AGAAGCAGAAGAAGAAGAAGAAGAAAAAAGTGAGTGTGAGGAAGTGTGTTAAAGTCACAT 

CGGTTGCACAGTTGTTCGATGATGCCAATTTTATAAATTCTTCTAGTATTAAAGGAAATG

TGATTAGTTCTATTGATAATCTTGAAAAAATGGGTCTAGAGCTTGATTTGAGTTTAGGGT

TGTTGTCTAGGAAGTGATAAAGCACTCGTAGTTAAGTAGTTGTAGTT

>G2291 Amino Acid Sequence (domain in AA coordinates: TBD)

MENSYTVDGHRLQYSVPLSSMHETSQNSETYGLSKESPLVCMPLFETNTTSFDISSLFSF 

NPKPEPENTHRVMDDSIAAVVGENVTjFGDKNKVSDHL 

KRPWGRWSAEIRDRIGRCRHWLGTFDTAEEAARAYDAAARRLRGTKAKTNFVIPPLFPKE
IAQAQEDNRMRQKQKKKKKKKVSVRKCVKVTSVAQLFDDANFINS
KMGLELDLSLGLLSRK* 
>G2452 (1..804) 

ATGTCATCGTCGACGATGTACAGAGGAGTTAATATGTTTTGACCGGCAAACA 
ATTTTTGAAGAAGTCAGAGAAGCCACGTGGACGGCGGAGGAG 

GCTCTCGCTTATCTGGACGACAAAGACAATCTTGAGAGCTGGTCCAAGATCGCAGATTTG 
ATTCCCGGCAAAACAGTAGCTGACGTCATTAAACGATACAAGGAGCTAGAGGATGATGTC 
AGCGACATCGAAGCCGGACTTATCCCCATTCCGGGATACGGCGGCGACGCCTCCTCCGCT 
GCAAACAGTGACTA^TTCTTTGGTCTAGAAAACTCGAGCTACGGTTATGATTACGTCGT^ 
GGAGGAAAGAGGAGTTCGCCGGCGATGACTGATTGTTTTAGGTCTCCGATGCCGGAAAAG 
GAGAGGAAGAAAGGAGTTCCGTGGACCGAGGACGAACACCTACGATTTCTGATGGGTTTG 
AAGAAATATGGAAAAGGAGATTGGAGAAACATAGCAAAAAGCTTTGTGACGACTCGAACG 
CCGACGCAAGTCGCTTCACACGCTCAGAAATATTTTCTTCGACAACTCACAGATGGTAAA 
GACAAAAGACGATCAAGTATTCACGATATCACCACTGTTAACATCCCTGACGCAGACGCA 
TCCGCAACCGCCACGACCGCTGACGTAGCACTCTCTCCTACTCGAGCCAATTCTTTTGAC 
GTTTTCCTTCAGCCAAATCCTCATTACAGTTTCGCGTCTGCGTCTGCGTCTAGCTATTAT
AATGCGTTTCCGCAGTGGAGTTAA

>G2452 Amino Acid Sequence (conserved domain in AA coordinates: 27-213)
MSSSTMYRGVNMFSPANTNWIFQEVREATWTAEENKRFEKALAYLDDKDNLESWSKIADL
IPGKTVADVIKRYKELKDDVSDIEAGLIPIPGYGGDASSAANSDYFFGLENSSYGYDYVV
GGKRSSPAMTDCFRSPMPEKERKKGVPWTEDEHLRFLMGLKKYGKGDWRNIAKSFVTTRT
PTQVASHAQKYFLRQLTDGKDKRRSSIHDITTVNIPDADASATATTADVALSPTPANSFD
VFLQPNPHYSFASASASSYYNAFPQWS*
>G2509 (143..934)

ATATATTCCCTCTTTGATTCTCCTTCTTC^ 

CCTCAATTCCAAATCTTAAACCCTAAATTTACAGACACAATCGAGATCACCTGAAAAAAG 
AGGTTTAAAGATTTTAGCAAAGATGGCGAATTCAGGAAATTATGGAAAGAGGCCCTTTCG
AGGCGATGAATCGGATGAAAAGAAAGAAGCCGATGATGATGAGAACATATTCCCTTTCTT
CTCTGCCCGATCCCAATATGA(^TGCGTGC<^TGGTCTCAGCCTTGACTCAAGTCATTGG 
AAACCAAAGCAGCTCTCATGATAATAACCAACATCAACCTGTTGT

TCCTAACCCACCGGCTCCTCCAACTCAAGATCAAGGGCTATTGAGGAAGAGGCACTATAG 
AGGGGTAAGACAACGACCATGGGGAAAGTGGGCAGCTGAAATTCGGGATCCGCAAAAGGC 
AGCACGGGTGTGGCTCGGGACATTTGAGACTGCTGAAGCTGCGGCTTTAGCTTATGATAA 
CGCAGCTCTTAAGTTCAAAGGAAGCAAAGCCAAACTCAATTTCCCTGAGAGAGCTC^^C^ 
AGCAAGTAACACTAGTACAACTACCGGTCCACCAAACTATTATTCTTCTAATAATCAAAT 
TTACTACTCAAATCCGCAGACTAATCCGCAAACC^TACCTTATTTTAACCAATACTACTA 
TAACCAATATCTTCATCAAGGGGGGAATAGTAACGATGCATTAAGTTATAGCTTGGCCGG 
TGGAGAAACCGGAGGCTCAATGTATAATCATCAGACGTTATCTACTACAAATTCTTCATC 
TTCTGGTGGATCTTCAAGGCAACAAGATGATGAACAAGATTACGCCAGATATTTGCGTTT 
TGGGGATTCTTCACCTCCTAATTCTGGTTTTTGAGATCTTCAATAAACTGATAATAAAGG 
ATTTGGGTCACTTGTTATGAGGGGATCATATGTTTTCTAA

>G2509 Amino Acid Sequence (domain in AA coordinates: 89-156)
MANSGNYGKRPFRGDESDEKKEADDDENIFPFFSARSQYDMRAMVSALTQVIGNQSSSHD
NNQHQPWYNQQDPNPPAPPTQDQGLLRKRHYRGW 

FETAEAAAIiAYDNAALKFKGSKAKLNFPERA snnqi yysnpqt 

NPQTIPYFNQYYYNQYLHQGGNSNDALSYSLAGGETGGSMYNHQTLSTTNSSSSGGSSRQ
QDDEQDYARYLRFGDSSPPNSGF* 
>G390 (1..2526) 

ATGATGGCTCATCACTCCATGGACGATAGAGACTCTCCTGATAAAGGATTTGATTCCGGC 

AAGTACGTTAGATACACGCCGGAACAAGTTGAAGCTCTTGAGAGAGTTTATGCTGAGTGT 

CCTAAACCTAGCTCTCTGAGAAGACAACAGCTTATTCGTGAATGTCCCATTCTCTGTAAC 

ATCGAGCCTCGACAGATCAAAGTTTGGTTCCAGAATCGCAGATGTCGAGAGAAGCAGAGG

AAAGAGTCAGCTCGTCTTCAGACAGTGAACAGGAAGCTGAGTGCTATGAACAAGCTTTTG 

ATGGAAGAGAATGATCGTTTGCAGAAGCAAGTCTCCAACTTGGTTTATGAGAATGGATTC 

ATGAAACATCGAATCCACACTGCTTCTGGGACGAC 

GTCGTGAGTGGTCAGCAACGTCAGCAGCAAAACCC 

GTTAACAACCCAGCTAATCTTCTCTCGATTGCGGAGGAGACCTTGGCGGAGTTCCTTTGC 

AAGGCTACAGGAACTGCTGTCGACTGGGTCCAGATGATTGGGATGAAGCCTGGTCCGGAT 

TCTATTGGTATCGTAGCTGTTTCACGCAACTGCAGTGGAATAGCAGCACGTGCCTGTGGC 

CTCGTGAGTTTAGAACCCATGAAGGTCGCTGAAATCCTCAAAGATCGTCCATCTTGGTTC 

CGTGACTGTCGATGTGTCGAGACTCTGAATGTTATACCCACTGGAAATGGTGGTACTATC 

GAGCTTGTCAACACTCAGATTTATGCTCCTACAACATTAGCAGCAGCTCGTGACTTTTGG 

ACGCTGAGATATAGTACAAGTCTAGAAGATGGAAGCTATGTGGTCTGTGAGAGATCACTC 

ACTTCTGCAACTGGTGGCCCCAATGGTCCACTTTCTTCAAGCTTCGTGAGAGCCAAAATG 

CTGTCAAGCGGGTTTCTTATCCGTCCTTGTGATGGTGGTGGTTCCATTATTCACATCGTT

GATCATGTGGACTTGGATGTCTCAAGTGTTCCTGAAGTCCTCAGGCCTCTTTATGAGTCT 

TCCAAAATCCTTGCTCAAAAAATGACTGTCGCTGCTCTGAGACATGTGCGCCAAATTGCT 

CAAGAGACTAGTGGAGAAGTCCAGTATAGTGGTGGACGCCAGCCTGCAGTTTTAAGGACT 

TTCAGCCAGAGACTCTGCCGGGGTTTCAATGATGCTGTAAATGGTTTTGTCGATGATGGA 

TGGTCTCCAATGAGTAGTGATGGAGGAGAGGATATTACGATCATGATTAACTCTTCCTCT 

GCTAAATTTGCTGGCTCCCAATACGGTAGCTCATTTCTTCCAAGTTTTGGAAGTGGTG^ 

CTCTGTGCCAAAGCTTCTATGCTGTTGCAGAATC 

CTGAGAGAACACCGAGCTGAATGGGCAGACTATGGTGTCGATGCCTATTCTGCTGCATCT 
CTCAGAGCAACTCCATATGCTGTTCCATGCGTCAGAACCGGTGGGTTCCCGAGTAACCAA 
GTCATTCTTCCTCTCGCA(^GACACTCGAACATGAAGAGTTTCTCGAAGTGGTTAGACTT
GGAGGTCATGCTTACTCACCTGAAGACATGGGCTTATCCCGGGATATGTATTTACTGCAG
CTTTGTAGCGGCGTTGATGAAAATGTGGTTGGAGGTTGTGCTCAGCTTGTCTTTGCCCCA
ATCGATGAATCATTTGCTGATGATGCACCTTTGCTTCCTTCTGGTTTCCGTGTCATACCA
CTCGACCAAAAAACAAATCCGAATGATCATCAATCTGCAA 

TCGTCCCTAGATGGTTCCACCAAAACCGATTCGGAAACAAACTCTAGATTGGTCTTAACA 

ATAGCCTTCCAGTTCACGTTTGATAACCATTCCAGAGACAATGTTGCTACAATGGCGAGA 

CAGTATGTGAGGAACGTTGTTGGTTCGATTCAGAGAGTGGCTCTAGCCATTACGCCTCGT 

CCTGGCTCAATGCAACl^CCCACTTCCCCTGAAGCn'CTCACTCTTGTCCGTTGGATC^ 

CGTAGTTACAGTATTCATACAGGTGCAGATCTGTTTGGAGCTGATTCTCAGTCCTGTGGA 

GGAGACACATTGCTTAAGCAACTCTGGGACCATAGTGATGCCATATTGTGCTGCTCCCTG 

AAAACTAATGCCTCACCGGTATTCACATTTGCAAACCAAGCTGGTT^ 

ACTACACTTGTGGCACTTCAGGATATAATGCTCGACAAAACACTTGATGACTCTGGTCGT 
AGAGCTCTTTGCTCCGAGTTCGCCAAGATCATGCAGCAGGGATATGCGAATCTTCCGGCA 
GGAATATGTGTGTCGAGCATGGGCAGACCGGTTTCGTATGAGCAAGCGACGGTGTGGAAA 
GTTGTTGATGACAACGAATCAAACCACTGCTTGGCTTTTACCCTCGTTAGTTGGTCGTTT 
GTTTGA 

>G390 Amino Acid Sequence (domain in AA coordinates: 18-81) 
MMAHHSMDDRDSPDKGFDSGKYVRYTPEQVEALERVYAECPKPSSLRRQQLIRECPILCN
IEPRQIKAmFQNRRCREKQRKESARIiQTVNRK^ 

MKHRIHTASGTTTDNSCESVVVSGQQRQQQNPTHQHPQRDVNNPANLLSIAEETLAEFLC

KATGTAVDWVQMIGMKPGPDSIGIVAVSRNCSGIAARACGLVSLEPMKVAEILKDRPSWF

RDCRCVETLNVIPTGNGGTIELVNTQIYAPTTLAAARDFWTLRYSTSLEDGSYVVCERSL

TSATGGPNGPIiSSSFVRAKMLSSGFIilRPODGGGSIIHIVDHVDLDVSSVPEVI^ 

SKILAQKMTVAALRHVRQIAQETSGEVQYSGGRQPAVLRTFSQRLCRGFNDAVNGFVDDG

WSPMSSDGGEDITIMINSSSAKFAGSQYGSSFLPSFGSGVLCAKASMLLQNVPPLVLIRF 

LREHRAEWADYGVDAYSAASLRATPYAVPCVRTGGFPSNQVILP^ 

GGHAYSPEDMGLSRDMYLLQLCSGVDENVVGGCAQLVFAPIDESFADDAPLLPSGFRVTP
LDQKTNPlTOHQSASRTRDIASSIiDGSTKTDSETO 

QYVRNVVGSIQRVALAITPRPGSMQLPTSPEALTLVRWITRSYSIHTGADLFGADSQSCG
GDTLLKQLWDHSDAILCCSLKTNASPOT 

RALCSEFAKIMQQGYANLPAGICVSSMGRPVSYEQATWKVTO^ 
V* 

>G391 (1..2559) 

ATGATGATGGTCCATTCGATGAGCAGAGATATGATGAACAGAGAGTCGCCGGATAAAGGG 
TTAGATTCCGGCAAGTATGTGAGGTACACGCCGGAGCAAGTGGAAGCTCTCGAGAGAGTT 
TACACTGAGTGTCCTAAGCCAAGTTCTCTAAGAAGACAACAACTCATACGTGAATGTCCG
ATTCTCTCTAACATCGAGCCTAAGCAGATCAA 

GAGAAGC^GAGGAAAGAAGCTGCTCGTCTTCT^AACAGTGAACAGAAAACTCAATGCCATG 
AACAAACTCTTGATGGAAGAGAATGATCGTTTGCAGAAGCAAGTTTCTAACTTGGTCTAT 
GAGAATGGCCACATGAAACATCAACTTCACACTGCTTCT^ 
TGTGAGTCTGTGGTCGTGAGTGGTCAGCAACATCAA 

CAGC^CGTGATGCTAACAACCCAGCAGGACTCCTTTCTATAGC^GAGGAGGCCCTAG^ 
GAGTTCCTTTCCAAGGCTACAGGAACTGCTGTTGACTGGGTTCAGATGATTGGGATGAAG
CCTGGTCCGGATTCTATTGGCATAGTCGCTATTTCGCG 

CGTGCCTGCGGCCTCGTGAGTTTAGAACCCATGAAGGTTGCTGAAATTCTCAAAGATCGT 
CCATCTTGGCTCCGAGATTGTCGAAGTGTGGATACTCTGAGTGTGATACCTGCTGGAAAC 
GGTGGGACGATCGAGCTTATTTACACGCAGATGTATGCTCCTACGACTTTAGCAGCAGCT 
CGTGACTTTTGGACGCTGAGATATAGCACATGTTTGGAAGATGGAAGCTATGTGGTTTGT 
GAAAGGTCGCTTACTTCTGCAACTGGTGGCCCCACTGGGCCACCTTCTTCAAACTTTGTG 
AGAGCTGAAATGAAAGCAAGCGGGTTTC^CATCCGTCCTTGCGATGGTGGTGGTTCCATT 
CTCCACATTGTTGATCATGTTGATCTGGATGCCTGGAGTGTCCCTGAAGTCATGAGGCCT 
CTCTATGAATCATCGAAGATTCTTGCTCAGAAAATGACTGTTGCTGCTTTGAGACATGTA 
AGACAAATTGCACAAGAAACAAGTGGAGAAGTTCAGTATGGTGGAGGGCGCCAACCTGCG 
GTTTTAAGAACCTTCAGTCAAAGACTCTGTCGGGGTTTCAATGATGCTGTTAATGGTTTT 
GTGGATGATGGATGGTCACCAATGGGTAGCGATGGTGCAGAGGATGTTACTGTAATGATA 
AACTTGTCCCCTGGGAAGTTTGGTGGGTCTCAGTACGGTAATTCATTCCTTCCAAGCTTT 
GGTAGTGGCGTGCTTTGTGCCAAGGCATCTATGTTGCTTCAGAACGTTCCACCCGCTGTG
CTGGTTCGATTCCTTAGAGAACACCGATCTGAATGGGCTGATTATGGCGTGGATGCTTAT
GCTGCTGCATCGCTCAGAGCAAGTCCTTTTGCTGTTCCTTGTGCTAGAGCTGGGGGGTTC 
CCAAGTAACCAAGTCATTCTTCCTCTTGCGCAGACAGTTGAACATGAAGAGTCACTTGAG 
GTGGTTAGACTTGAAGGTCACGCTTACT(^CCCGAAGACATGGGTTTAGCTCGGGATATG 
TATTTGCTACAGCTTTGTAGCGGTGTTGATGAAAATGTGGTTGGAGGTTGTGCACAGCTT 
GTATTTGCCCCTATCGATGAATGATTTGCTGA 

CGCATCATACCTCTTGAACAGAAATCTACTCCGAACGGTGCATCTGCAAACCGTACCCTG 
GATTTAGCCTCAGCTTTAGAAGGATCCACACGTCAAGCTGGTGAAGCCGACCCAAATGGC 
TGTAACTTTAGGTCGGTACTAACCATAGCATTCC^ 

GA(^GTGTTGCTTCAATGGCACGTCAGTACGTGCGAAGCATAGTAGGATCGATTCAGAGG 
GTTGCTCTAGCCATTGCTCCTCGTCCTGGCT 

TCCCCTGAAGCTCTCACTCTGGTCCGTTGGATCTCCCGGAGTTACAGCCTTCACACTGGT 

GCAGATCTCTTTGGATCTGATTCTCAAACCAGTGGTC 

AATC^CTCTGATGCAATCTTGTGCTGCTCCCTCAAAACAAACGCTO 

TTCGCAAACCAAACCGGTTTAGACATGCTGGAAACGACTCTTGTAGCCCTTCAAGACATA 
ATGCTAGACAAGACCCTTGACGAACCTGGTCGTAAAGCTCTTTGCTCTGAGTTCCCCAAG 
ATC^TGCAACAGGGCTATGCTCATCTGCCGGCAGGAGTATGTGCGTCAAGCATGGGAAGG 
ATGGTATCTTACGAGCAGGCAACGGTGTGGAAAGTTCTTGAAGACGATGAATCAAACCAC 
TGCTTAGCTTTCATGTTCGTGAATTGGTCGTTCGTTTGA 

>G391 Amino Acid Sequence (domain in AA coordinates: 25-85) 

MMMVHSMSRDMMNRESPDKGLDSGKYVRYTPEQ 

ILSNIEPKQIKVWFQNRRCREKQRKELAARLQTVira 

ENGHMKHQLHTASGTTTDNSCESVVVSGQQHQQQNPNPQHQQRDANNPAGLLSIAEEALA

EFLSKATGTAVDWVQMIGMKPGPDSIGIVAISRNCSGIAARACGLVSLEPMKVAEILKDR

PSWLRDC^SVDTLSVIPAGNGGTIELIYTQMYAPTTIiAAARDFWTLRYSTC^ 

ERSLTSATGGPTGPPSSNFVRAEMKPSGFLIRPCDGGGSILHIVDHVDLDAWSVPEVl^P 

LYESSKIIjAQKMTVAALRHVRQIAQETSGEVQYGGGRQPA 

VDDGWSPMGSDGAEDVTVMimSPGKFGGSQYGNSFLPSFGSGVL^ 

LTOFLREHRSEWADYGVDAYAAASLRASPFAVPCARAGGFPSNQVILPIiAQTVE 

VVRLEGHAYSPEDMGLARDMYIJjQIjCSGVDENVVGG<^ 

RI I PLEQKSTPNGASANRTLDLASALEGSTRQAGEADPNGCNFRS VIiTIAFQ 

DSVASMARQYVRSIVGSIQRVALAIAPRPGSNISPISVPTSPEALTIiVRWISRSYSLOT 

ADLFGSDSQTSGDTLIJIQLWimSDAILCCSLK^ 

MLDKTIJ3EPGRKAL,CSEFPKIMQQGYAHLPAGVCASSMGRW 

CLAFMFVNWSFV*

>G438 (188..2716)

CGGGGTACCCAAGCCACGACCGTAGAATCTTCTTTTC 

TTTCTCTTACGATACGACGGACTTTCCGAAGAAATTAATTTAAAGAGAAAAGAAGAAGAA 

GCCAAAGAAGAAGAAiGAAGCTAGAAGAAACAGTAAAGTTTGAGACTTTTTTTGAG 

AGCTAAAATGGAGATGGCGGTGGCTAACCACCGTGAGAGAAGCAGTGACAGTATGAATAG 

ACATTTAGATAGTAGCGGTAAGTACGTTAGGTACACAGCTGAGCAAGTCGAGGCTCTTGA

GCGTGTCTACGCTGAGTGTCCTAAGCCTAGCTCTCTCCGTCGACAACAATTGATCCGTGA 

ATGTTCC^TTTTGGCCAATATTGAGCC^^ 

GTGTCGAGATAAGCAGAGGAAAGAGGCGTCGAGGCTCCAGAGCGTAAACCGGAAGCTCTC 
TGCGATGAATAAACTGTTGATGGAGGAGAATGATAGGTTGCAGAAGCAGGTTTCTCAGCT 
TGTCTGCGAAAATGGATATATGAAACAGCAGCTAACTACTGTTGTTAACGATCCAAGCTG 
TGAATCTGTGGTCACAACTCCTCAGCATTCGCTTAGAGATGCGAATAGTCCTGCTGGATT 
GCTCTCAATCGCAGAGGAGACTTTGGCAGAGTTCCTATCCAAGGCTACAGGAACTGCTGT 
TGATTGGGTTCAGATGCCTGGGATGAAGCCTGGTCCGGATTCGGTTGGCATCTTTGCCAT 
TTCGCAAAGATGCAATGGAGTGGCAGCTCGAGCCTGTGGTCTTGTTAGCTTAGAACCTAT 
GAAGATTGCAGAGATCCTCAAAGATCGGCCATCTTGGTTCCGTGACTGTAGGAGCCTTGA 
AGTTTTC^CTATGTTCCCGGCTGGT^^ 

GTATGCACCAACGACTCTGGCTCCTGCCCGCGATTTCTGGACCCTGAGATACACAACGAG 
CCTCGACAATGGGAGTTTTGTGGTTTGTGAGAGGTCGCTATCTGGCTCTGGAGCTGGGCC 
TAATGCTGCTTCAGCT^CT(^GTTTGT^ 

AAGGCCTTGTGATGGTGGTGGTTCTATTATTCACATTGTCGATCACCT 
TTGGAGTGTTCCGGATGTGCTTCGACCCCTTTATGAGTCATCCAAAGTCGTTGCACAAAA
AATCACCATTTCCGCGTTGCGGTATATCAGGCAATTAGCCCAAGAGTCTAATGGTGAAGT
AGTGTATGGATTAGGAAGGCAGCCTGCTGTTCTTAGAACCTTTAGCCAAAGATTAAGCAG
GGGCTTCAATGATGCGGTTAATGGGTTTGGTGACGACGGGTGGTCTACGATGCATTGTGA 
TGGAGCGGAAGATATTATCGTTGCTATTAACTCTACAAAGCATTTGAATAATATTTCTAA 
TTCTCTTTCGTTCCTTGGAGGCGTGCTCTGTGCCAAGGCTTCAATGCTTCTCCAAAATGT 
TCCTCCTGCGGTTTTGATCCGGTTCCTTAGAGAGCATCGATCTGAGTGGGCTGATTTCAA 
TGTTGATGCATATTCCGCTGCTACACTTAAAGCTGGTAGCTTTGCTTATCCGGGAATGAG 
ACC^^CAAGATTCACTGGGAGTCAGATCATAATGCCACTAGGACATACAATTGAACACGA 
AGAAATGCTAGAAGTTGTTAGACTGGAAGGTCATTCTCTTGCTCAAGAAGATGCATTTAT
GTCACGGGATGTCCATCTCCTTCAGATTTGTACCGGGATTGACGAGAATGCCGTTGGAGC 
TTGTTCTGAACTGATATTTGCTCCGATTAATGAGATGTTCCCGGATGATGCTCCACTTGT 
TCCCTCTGGATTCCGAGTCATACCCGTTGATGCTAAAACGGGAGATGTACAAGATCTGTT 
AACCGCTAATCACCGTACACTAGACTTAACTTCTAGCCTTGAAGTCGGTCCATCACCTGA
GAATGCTTCTGGAAACTCTTTTTCTAGCTCAAGCTCGAGATGTATTCTCACTATCGCGTT 
TCAATTCCCTTTTGAAAACAACTTGCAAGAAAATGTTGCTGGTATGGCTTGTCAGTATG^ 
GAGGAGCGTGATCTCATCAGTTCAACGTGTTGCAATGGCGATCTCACCGTCTGGGATAAG 
CCCGAGTCTGGGCTCCAAATTGTCCCCAGGATCTCCTGAAGCTGTTACTCTTGCTCAGTG 
GATCTCTCAAAGTTACAGTCATCACTTAGGCrC^ 

AAGCGACGACTCGGTACTAAAACTTCTATGGGATCACCAAGATGCCATCCTGTGTTGCTC 
ATTAAAGCCACAGCCAGTGTTCATGTTTGCGAACCAAGCTGGTCTAGACATGCTAGAGAC 
AACACTTGTAGCCTTACAAGATATAACACTCGAAAAGATATTCGATGAATCGGGTCGTAA 
GGCTATCTGTTCGGACTTCGCCAAGCTAATGCAAGAG^ 

AATCTGTGTGTCAACGATGGGAAGACATGTGAGTTATGAACAAGCTGTTGCTTGGAAAGT 
GTTTGCTGCATCTGAAGAAAACAACAACAATCTGCATTGTCTTGCCTTCTCCTTTGTAAA 
CTGGTCTTTTGTGTGATTCGATTGACAGAAAAAGACTAATTTAAATTTACGTTAGAGAAC 
TCAAATTTTTGGTTGTTGTTTAGGTGTCTCTGTTTTGTTTTTTAAAATTATTTTGATCAA 
A 

>G438 Amino Acid Sequence (domain in AA coordinates: 22-85) 

^mVANHRERSSDSMNRHLDSSGKYWYTAEQVEAL 

ILANIEPKQ I KVWFQNRRCRDKQRKEASRLQSVlTRKIiSAMNK^ 

ENGYMKQQLTTVVNDPSCESVVTTPQHSLRDANSPAGLLSIAEETLAEFLSKATGTAVDW
VQMPGMKPGPDSVGIFAISQRCNGVAARACGLVSLEPMKIAEILKDRPSWFRDCRSLEVF
TMFPAGNGGTIELVYMQTYAPTTLAPARDFWTLRY 
ASASQFVRAEMLSSGYLIRPCDGGGSIIHIVDHLm 

ISALRYIRQLAQESNGEVVYGLGRQPAVLRTFSQRLSRGFNDAVNGFGDDGWSTMHCDGA
EDIIVAINSTKHLNNISNSLSFLGGVLCAKASMLLQNVPPAVLIRFLREHRSEWADFNVD
AYSAATLKAGSFAYPGMRPTRFTGSQIIMPLGHTIEHEEMLISrW 

DVHLLQICTGIDENAVGACSELIFAPINEMFPDDAPLVPSGFRVIPVDAKTGDVQDLLTA 
NHRTLDLTSSLEVGPSPENASGNSFSSSSSRCILTIAFQFPFENNLQENVAGMACQYVRS

VISSVQRVAMAISPSGISPSLGSKLSPGSPEAVTLAQWISQSYSHHLGSELLTIDSLGSD

dsvlkllwdhqdaj^ccslkpqpvfmfanqag]^^ 

csdfaklmqqgfaclpsgicvstmgrhvsyeqavawkvfaaseel^^ 

FV*

>G47 (38..472)

CTTCTTCTTCACATCGATCATCATACAACAAGAAAAAATGGATTACAGAGAATCCACCGG 

TGAAAGTCAGTCAAAGTACAAAGGAATCCGTCGTCGGAAATGGGGCAAATGGGTATCAGA 

GATTAGAGTTCCGGGAACTCGTGACCGTCTCTGGTTAGGTTCATTCTCAACAGCAGAAGG 

TGCCGCCGTAGCACACGACGTTGCTTTCTTCTGTTTACACCAACCTGATTCTTTAGAATC 

TCTCAATTTCCCTCATTTGCTTAATCCTTC^CTCGTTTCCAGAACTTCTCCGAGATCTAT 

CCAGCAAGCTGCTTCTAACGCCGGCATGGCCATTGACGCCGGAATCGTCCACAGTACCAG 

CGTGAACTCTGGATGCGGAGATACGACGACGTATTACGAGAATGGAGCTGATCAAGTGGA 

GCCGTTGAATATTTCAGTGTATGATTATCTGGGCGGCCACGATCACGTTTGATTTATCTC 

GACGGTCATGATCACGTTTGATCTTCTTTTGAGTAAGATTTTGTACC^TAATGAAAACAG 

GTGTGGTGCTAAAATCTTACTCAAAACAAGATTAGGTACC^ 

TTGTGAATATACATTATAAGGTTTTGATTAATGTTTGTTTCACTGA 

gtccattgtatagaaatctattcaagaaacctagc^ 
attgagatttttaagtattcgtaatat^^
AAAAA

>G47 Amino Acid Sequence (domain in AA coordinates: 11-80) 
MDYRESTGESQSKYKGIRRRKWGKVWSEIRVPGTRD^ 

HQPDSLESLNFPHLLNPSLVSRTSPRSIQQAASNAGMAIDAGIVHSTSVNSGCGDTTTYY
ENGADQVEPLNISVYDYLGGHDHV*
>G559 (89..1285)

aaagttgctagctttaatttgccaacttactattcttatgtgtaataatcgtttgcaggg 
tcgttgatttggtgataagtcagtagaaATGgataaggagaaatctccagcacctccttg 
tggaggtcttcctcctccatctccatcaggtcgatgctctgcattctcagaagctggtcc 
cattggtcatggttcagatgctaatcgaatgagtcatgatattagccgtatgcttgataa 
cccacctaagaagattggacatcggcgagctcattctgaaatacttactctccctgatga 
tttgagctttgatagtgatcttggtgtggttggtaatgctgctgatggagcttctttctc 
tgatgagactgaagaagatttgctctctatgtatcttgatatggataagtttaattcttc 
tgctacatcttctgcccaagttggtgagccatcaggaactgcttggaaaaatgagacaat 
gatgcagacaggcacaggctcaacttccaatcctcagaatacggttaatagtcttggcga 
aaggccaagaatcaggcatcaacatagccaatctatggatggttcaatgaatatcaatga 
gatgcttatgtcgggaaatgaagatgattctgctattgatgctaagaagtctatgtctgc 
tactaaacttgctgagcttgctctcattgatcctaaacgtgctaagaggatatgggcaaa 
caggcagtccgcagcacgatcaaaagaaaggaagacgagatacatatttgagcttgagag 
aaaagtacagactttgcaaacagaggctacaactctctcagcccagttgaccctcttaca 
gagagacacaaatggcttgactgttgaaaacaatgagctgaagctgcggttacaaacaat 
ggagcagcaggttcacttgcaggatgaactaaacgaagcactaaaggaggaaatccagca 
tctgaaggtgttgactggccaagttgctccatcagcgttgaactatgggtcgtttggatc 
aaaccagcagcaattctattccaacaatcagtcaatgcaaacaatcttagctgcaaaaca 
gttccagcaacttcagattcattcacagaagcagcaacaacaacaacaacaacaacaaca 
gcaacaccaacagcagcagcagcaacagcaacagtatcagtttcaacagcaacagatgca 
acagcttatgcagcagcggcttcaacagcaagaacaacaaaatggagtaagactcaagcc 
ttcacaagcccagaaagagaacTGAggaatatgaatatgtcccacgtaagtgagaggttc 
tccttctgaacaattcctttctcattcataaattgttgttcatccatcacttgcagtctc 
ttggattttagggttttagctaacaca 

>G559 Amino Acid Sequence (domain in AA coordinates: 203-264) 
MDKEKSPAPPCGGLPPPSPSGRCSAPSEAGPIGHGSDANRMSHDISRMLDNPPKKIGHRR
AHSEILTLPDDIiSFDSD^WGNAAIXSASFSDET^ 
PSGTAWKNETMMQTGTGSTSNPQimmSIjGE 

SAIDAKKSMSATKLAELALIDPKRAKRIWANRQSAARSKERKTRYIFELERKVQTLQTEA
TTLSAQLTLLQRDTNGLTVEJTbTELKLRLQTMEQQVHLQDELNR 

PSALNYGSFGSNQQQFYSNNQSMQTILAAKQFQQLQIHSQKQQQQQQQQQQQHQQQQQQQ

QQYQFQQQQMQQLMQQRLQQQEQQNGVRLKPSQAQKEN*

>G568 (141..995)

GACCGGCTAAAGTCAAGAACCTCTCTCTGAGCTCTCACCACTXTCTCTCTCTACTCCCTC 

TCTGCGTGTAGGATACTACTAGACAATTGACAACCAAAGACTAAAGCTGTGTTGTTGGTT 

CACTTCTGTTCTCTTTTCCAATGTTGTCATCAGCTAAGCATCAGAGAAACCATAGACTCT 

CTGCTACAAACAAGAACCT^GACTCTCACCAAAGTTTCTTCCATTTCATCCTCATCACCAT 

CGTCTTCTTCTTCATCATCATCAACCTCATCATGATCTCCTTTACCTTCTCAAGACTCTC 

AAGCCCAGAAGAGATCTCTTGTCACCATGGAAGAAGTTTGGAATGACATCAACCTTGCTT 

CCATCCACCACCTAAACCGACACAGCCCTCATCCACAACAC^UVCCACGAGCCAAGGTTCA 

GGGGCCAAAACCACCACAACCAAAACCCTAACTCAATCTTCC^AGATTTTCTCAAAGGAT 

CnTTGAACCAGGAACCAGCACCCACAAGCCAGACCACGGGTTCTGCGCCTAATGGCGATT 

CCACCACGGTCACTGTTCTTTACAGCTCTCCTTTTCCACCTCCTGCAACTGTTCTGAGCT

TGAATTCCGGCGCTGGCTTCGAGTTTCTCGATAACCA^^ 

CTAATCTTCATACCCACCATCACCTCTCA 

CTCTGGTTCC^TCCAGTTCTTTTGGTAAGAAAAGAGGCCAAGATTCCAATGAAGGTTCAG 
GGAATAGAAGACATAAGCGTATGATCAAGAACAGAGAATCTGCAGCTCGTTCCCGCGCTA 
GGAAACAGGCTTATACAAACGAGTTAGAACTTGAAGTTGCTCACTTGCAGGCAGAAAATG 
CAAGACTCAAGAGACAACAAGATCAAAAAATGGCTGCAGCAATTC^ 

ACACACTTCAACGGTCTTCCACAGCTCCATTTTGAGAAATCTACAAGTCCTTGTTTCTCT 
TTTGGGGATTGAGATTGTCTCATGAAGAAGTGAAAAAATGGCAAAAGTTTGTACCCTTTT
TTATTAGCTATAAGTATAACTAAGCCTAAAATTGTAGAACTAAGATATTGTAGGGGAAAA
AAGAAGATGTAAAACAAAAGACCCGGAAAGAGAAAAGGATCTTTCAATTTCCTAAGGCAC 
AGGAACACCTGTCCTGGGTCCTCTCTTAATGTTCTGTCGTTTTCOTATGCAAACCCTTTT 
TTCACTTCTGTACTAACTTATACTTGTATTCTTG 

>G568 Amino Acid Sequence (domain in AA coordinates: 215-265) 
MLSSAKHQRimRIiSATNK^^ 

VTMEEVWNDINLASIHHLNRHSPHPQHNHEPRFRGQNHHNQNPNSIFQDFLKGSLNQEPA
PTSQTTGSAPNGDSTTVTVLYSSPFPPPATVLSLNSGAGFEFLDNQDPLVTSNSNLHTHH
HLSNAHAFNTSFEALVPSSSFGKKRGQDSNEGSGRR^ 
ELELEVAHLQAENARLKRQQDQKMAAAIQQPKKOTLQRSSTAPF*
>G580 (43..747)

CCAAAAAACAAAGCATTCTATGCTAOTCTGTTCT^ 

C^TAATAAGATCAACAACCATAGTGCCTTTTCAATTTCCTCTTC^TCATC^ 

ACATGATCCTCCCTAGGCCATAACAAATCTCAAGT 

ATCAACCTTGGTTCACTTGACTACCATCGGCAACTAAACATTGGTC^ 

AAGAACCAAAACCCTAATAACTCCATCTTTCAAGATTTCCTCAACATGCCT 

CCACC^CCACC^CC^CCACCACCTTC^ 

CTGCCTCTTCCGCCTCCTGCCACTGTCCTCAGCTTAAACTCCGGTGTTGGATTCGA 

CTTGATAC(^CAGAAAATCTTCTTGCTTCTAACCCTCGCTCCTTTGAGGAATCTGCAAAG 

TTTGGTTGTCTTGGTAAGAAAAGAGGCCAAGATTCTGATGATACTAGAGGAGACAGAAGG 

TATAAGCGTATGATCAAGAACAGAGAATCTGCTGCTCGTTCAAGGGCTAGGAAGCAGGCA

TATACAAACGAACTTGAGCTTGAAATTGCTCACTTGCAGACAGAGAATGGAAG 

ATACAACAAGAGCAGCTGAAAATAGCCGAAGCAACTCA 

CAACGGTCTTCCACAGCTCCATTTTGAGAAAAATCTACTATTTCTTTTTGGGGGAGTITC 
AAGTGTTTCTTATGAAGATGAGAAAAACAGAAAAAGTTTGTACA^'XIUAGCTAAGTTAAA 
TTTGTGGTGGTAAGTAATGTAAAAGAAAAGTGTGTGTAGAAGAAAAGTGTCTAGAAAAAG 
AAAGCAACTAACTTTCTTCTTCTTCTCTGGTTTCCTATCAACTCTTTTGACTTTTGTACT 
TTTTTTCTTCTCTACTTAACCTCTATTATTGTAATGCCAAGTCAAGTCCTTATCTAGCTA 
GTACATGAGTTTCTGTTTTCACTGGTTAAGCCAT

>G580 Amino Acid Sequence (domain in AA coordinates: 162-218)
IVfrSSAKHl^imHSAFSISSSSSSLSTSSSLGHN^ 

GHEPMLKNQNPNNSIFQDFLNMPLNQPPPPPPPPSSSTIVTALYGSLPLPPPATVLSLNS
GVGFEFLDTTENLLASNPRSFEESAKFGCLGKKRGQDSDDTRGDRRYKRMIKNRESAARS
RARKQAYTNELELEIAHLQTENARLKIQQEQLKIAEATQNQVKKTLQRSSTAPF*
>G615 (197..1252)

TTTTTTCTTTTCTTTCTT'iU^rTGCTGGTGTGAGAAATTGTACGCTTACTATCTCTCTC^ 
CTCTCTGCCAGATTCTCTCTTTTTGATGATGTGAAAGTTGTGCTTTTGTTTCTTAAGAAA 
AAGGCATATTTTTAATACTTGATTCTTGGTTCTTGATTCTTGATTCTTGGTTTTTTTTAG 
CTTCTTAAGTTCGGTGATGTCGTCTTCCACCAATGACTACAACGATGGTAATAACAATGG 
AGTGTACCCTCTCTCTCTTTACCTTTCTTCACTCTCTGGCCATCAAGACATCATTCATAA 
TCCCTACAACCATCAGTTAAAAGCATCTC 

TCTGATCGATTACATGGCGTTTAAGTCAAATAATGTTGTGAATCAACAAGGCTTTGAGTT 
TCCTGAGGTGTCAAAGGAAATGAAGAAGGTGGTGAAGAAGGACCGACATAGCAAGATTCA 
AACGGCACAAGGGATTAGAGACAGGAGGGTTAGGCTTTTTATTGGGATTGCTCGCCAATT 
CTTTGATCTTCAGGATATGTTGGGGTTTGATAAAGCTAGTAAAACGTTAGACTGGCTGCT 
CAAGAAGTCAAGAAAAGCCATCAAAGAGGTCGTAGAAGCAAAA^ 

TGAAGATTTTGGAAACATTGGAGGCGATGTAGAACAAGAAGAGGAGAAGGAGGAGGATGA 
CAATGGCGATAAGAGCTTCGTGTATGGTTTGAGCCCCGGGTACGGTGAAGAAGAAGTGGT 
ATGTGAGGCCACGAAGGCAGGGATAAGAAAGAAGAAGAGTGAGTTGAGAAACATCTCATC 
AAAGGGGCTAGGAGCCAAAGCTAGAGGAAAAGCAAAGGAGCGAACAAAAGAGATGATGGC 
CTATGATAATCCAGAGACTGCCTCTGATATTACACAATCTGAAATCATGGACCCATTCAA 
GAGGTCTATAGTCTTCAATGAAGGAGAAGATATGACACACCTTTTCTACAAGGAACC^T 
CGAGGAGTTTGATAATCAAGAATCTATCTTAACCAATATGACTCTACCAACGAAGATGGG 
TCAAAGTTAGAATCAAAATAATGGGATACTTATGTTGGTAGATCAGAGTTCTAGCAGCAA 
CTATAATACATTTCTGCCTCAAAATTTGGATTA^ 

CCAAACCTTATATGTAGTCACCGACAAAAATTTCCCCAAAGGTTTCCTATAAATCTCGAC 
AGTTTTGAAGGACTATGCATGATCAAGTTTAAACATGTAAGCCAATATAGTCCCTTATTC
CTCTGAATGTATACAAAATCTATAGTTATGTATATCTGTTCCTTTTTAACGTATCTTTAT
TGATCTTCTGTGCCTTGATCAAAATTGTGATTTT^ 

CTACAACTTTTAAGTGGTATTATTGTAACCTTTTGAACTATATATTTTGAAGATGAATAA 
GAACATGTTTATATAAAAA

>G615 Amino Acid Sequence (domain in AA coordinates: 88-147)
MSSSTNDYm)GNNNGVYPLSI/raSSLSGHQDI^ 
AFKSNNVVNQQGFEFPEVSKEIKKWKKDRHSKIQT^^ 
MLGFDKASKTliDWLLKKSRKAIKEWQAI<^^ 

FVYGLSPGYGEEEVVCEATKAGIRKKKSELRNISSKGLGAKARGKAKERTKEMMAYDNPE
TASDITQSEIMDPFKRSIVFNEGEDMTHLFYKEPIEEFDNQESILTNMTLPTKMGQSYNQ
NNGILMLVDQSSSSNYOTFLPQlsn^^
>G732 (73..588)

AAAAAAACCAAACATA?^AACATAAAACTCTGTCCTTTTTTTG 

TGTTAAAAATCAATGGCGTCATCTAGCAGCACATACCGGAGCTCAAGCTCTTCCGACGGT 

GGTAATAATAACCCGTCGGACTCCGTCGTCACCGTCGACGAACGAAAACGTAAAAGAATG 

TTATCGAAC^GAGAATCTGCACGTAGGTCAAGGATGCGTAAACAGAAACACGTTGATGAT 

CTAACGGCTCAGATCAATCAGCTATCAA 

GTAACATCTCAGCTTTACATGAAGATCCAA 

GAGGAGCTTAGCACCAGACTCCAATCTCTCAACGAGATC 

GGTGCAGGATTTGGTGTTGACCAGATCGACGGCTGTGGTTTTGATGATCGTACGGTTGGG
ATCGACGGATATTACGATGATATGAATATGATGAGTAATGTTAATCATTGGGGTGGTTCG 
GTTTACACTAACCAACCCATTATGGCTAATGATATC^TATGTATTGATTAATAAAATTA 
ATTAAAATAATTAGATGCCCCTTTTTTGTCT 

GTTTTTGGGTTGGTGTGATGATGTAATTATAGTACATGCATCTTTGATTGGTTGGAAGGA 

TAAATATAAACTTTATATATATATTGGGGCATATATATATGAGTTGTACTTTGCATGTAT 

TGGTGTGTGTTTTGTTATAATTATATGATTATATATGTTTATGTTAAAAAAAAA 

>G732 Amino Acid Sequence (domain in AA coordinates: 31-91) 

MASSSSTYRSSSSSDGGRTNNPSDSVVTVDERK 

INQLSNDNRQILNSLTWSQLYMKIQAEN^ 

GVDQIDGCGFDDRTVGIDGYYDDMNl^SNVNHWGGSV^

>G988 (1..1338)

ATGCTTACTTCCTTCAAATCCTCTAGCTCCTCCTCCGAAGATGCCACCGCTACCACCACC 
GAGAATCCTCCTCCTTTGTGCATCGCCTCCTCCTCGGCCGCAACCTCCGCCTCACATCAC 
CTCCGTCGTCTTCTTTTCACCGCTGCGAATTTCGTCTCCCAGTCAAACTTCACCGCCGCT 
CAAAACTTACTCTCAATCCTCTCCCTTAACTCTTCTCCTCACGGCGACTCCACCGAGCGA 
CTTGTACACCTCTTCACTAAAGCCTTGTCCGTACGAATC^CCGTCAGCAACAAGATCAG 
ACGGCTGAAACGGTTGCCACGTGGACGACGAACGAAATGACGATGAGTAACTCCACGGTG 
TTCACGAGCAGTGTATGGAAAGAACAGTTCTTC 

TTCGAGTCTTGTTACTATCTTTGGCTAAACCAACTAACGCCGTTTATTCGGTTCGGTCAT 
TTAACGGCGAACCAAGCTATCCTCGACGCGACGGAGACAAACGATAACGGAGCTCTACAT 
ATACTTGATTTAGATATATCACAAGGACTTCAATGGCCTCCATTGATGCAAGCCCTAGCA 
GAGAGGTCATCAAACCCTAGCAGTCCACCTCCATCTCTCCGCATAACCGGATGCGGTCGA 
GATGTAACCGGATTAAACCGAACTGGAGACCGGTTAACCCGGTTCGCTGACTCTTTAGGT 
CTCCAATTCCAGTTTCACACGCTAGTGATCGTAGAAGAAGATCTCGCCGGACTTTTGCTA 
CAGATCCGATTGTTAGCTCTCTCAGCCGTAC^GGAGAGACCATTGCCGTCAATTGTGTT 
CACTTCCTCCACAAAATATTTAACGACGATGGAGATATGATCGGTCACTTCTTGTCAGCG 
ATCAAGAGCTTAAACTCTAGAATCGTTACAATGGCAGAGAGAGAAGCTAATCATGGAGAT 
CACTCGTTCTTGAATAGATTCTCTGAGGCAGTGGATCATTACATGGCGATCTTTGATTCG
TTGGAAGCGACGTTGCCGCCAAATAGCCGAGAGAGACTAACCCTAGAGCAACGGTGGTTC 
GGTAAGGAGATTTTGGATGTTGTGGCGGCGGAAGAGACGGAGAGAAAGCAAAGACATCGG 
AGGTTTGAGATTTGGGAAGAGATGATGAAGAGGTTTGGTTTCGTTAACGTTCCTATTGGA 
AGCTTTGCTTTGTCTCAAGCTAAGCTTCTTCTTAGACTTC^TTATCCTTCAGAAGGTTAT 
AATCTTCAGTTCCTTAACAATTCTTTGTTTCTTGGCTGGCAAAATCGTCCCCTCTTCTCC 
GTTTCGTCGTGGAAATGA

>G988 Amino Acid Sequence (domain in AA coordinates: 178-195)
MLTSFKSSSSSSEDATATTTENPPPLCIASSSAATSASHHLRRLLFTAANFVSQSNFTAA
QNLLSILSLNSSPHGDSTERIjVHIjFTKAIjSTO
FTSSVCKEQFLFRTKNl^SDFESCYYLWLNQLTPFIRFGHLTANQAILDATETNDNGALH
IIJDLDISQGLQWPPLMQAIAERSSNPSSPPPSLRITGCG^ 

LQFQFHTLVIVEEDLAGLLLQIRLLALSAVQGETIAVNCVHFLHKIFNDDGDMIGHFLSA

IKSLNSRIVTMAEREANHGDHSFLNRFSEAVDHYMAIFDSLEATLPPNSRERLTLEQRWF

GKEILDWAAEETERKQRHRRFEIWEBMMKRFGFVNTO 

KTLQFLNNSLFLGWQNRPLFSVSSWK* 

>G1519 (1..1146) 

ATGAGGCTTAATGGGGATTCGGGTCCGGGTCAGGATGAACCCGGTTCGAGCGGGTTTCAC 

GGCGGAATC^GACGATTCCCGTTAGCAGCTCAGCCGGAGATTATGAGAGCTGCTGAGAAA 

GACGATC^TACGCTTCTTTCATCC^CGAAGCTTGCCGCGATGCCTTCCGAC^CCTTTTC 

GGTACAAGAATCGCTCTTGCTTACCAGAAGGAGATGAAGCTACTTGGACAGATGCTTTAC 

TATGTTCTTACGACAGGTTCAGGGCAACAAACTTTAGGAGAGGAATATTGTGACATTATA 

CAGGTTGC^GGGCCTTATGGACTCTCTCCTACACCAGCTAGACGTGCTTTGTTC^TATTG 

TACCAGACCGCAGTTCCATATATCGCAGAGAGAATTAGCACTCGAGCTGCTACGCAAGCA 

GTCACCTTTGATGAGTCTGATGAGTTTTTTGGTGATAGTCATATCCACTCACCAAGAATG 

ATAGATCTTCCATCTTCATCTCAAGTTGAAACTTCAACTTCTGTAGTATCTAGGTTAAAC 

GATAGACTTATGAGATCGTGGCACCGAGCTATTCAGCGATGGCCTGTGGTTCTTCCTGTT 

GCCCGCGAAGTCTTACAACTGGTTTTGCGTGCCAATCTGATGCTCTTCTACTTTGAAGGT 

TTTTATTATCATATATCGAAACGTGCATCCGGGGTTCGTTATGTTTTCATAGGAAAGCA^ 

CTGAATCAGAGACCTAGATACCAAATTCTTGGGGTTTTCCTTCTAATCCAATTGTGCAT 

CTTGCTGCTGAGGGCTTGCGTCGGAGTAATTTGTCATCTATCAC^AGCTCCATTCAGCAG 

GCTTCTATAGGATCTTATCAAACTTCAGGAGGGAGAGGTTTACCTGTTTTAAATGAAGAG 

GGGAATTTGATAACTTCGGAAGCTGAAAAGGGAAACTGGTCTACCTCCGATTCAACTTCA 

ACGGAGGCAGTAGGGAAATGGACTCTCTGCTTAAGCACCCGTCAGCACCCAACGGCCACT 

CCTTGTGGTCATGTGTTTTGTTGGAGCTGCATTATGGAATGGTGCAACGAGAAGCAAGAA 

TGCCCTCTTTGTCGAACGCCCAATACCCATTCAAGTTTGGTTTGTTTGTATCATTCTGAT 

TTTTAG 

>G1519 Amino Acid Sequence (domain in AA coordinates: 327-364) 
MRLNGDSGPGQDEPGSSGFHGGIRI^PIiAAQPEIMRAAEKDDQYASFIHEACRDAFRHIjF 
GTRI ALAYQKEMKLLGQMLYYVIiTTGSGQQTLGEEYCDI IQVAGP YGLS PTPARRALF IIi 
YQTAVPYIAERISTRAATQAVTFDESDEFFGDSHIHSPRMIDLPSSSQVETSTSWSRLN 
DRLMRSWHRAIQRWPWLPVAREVLQL^ 

LNQRPRYQILGVFLLIQLCILAAEGLRRSNLSS ITSS IQQAS IGS YQTSGGRGLPVIiNEE 
GNLITSEAEKGl^STSDSTSTEAVGKCTLCLSTRQHPTATPCGHVTCWSCIMEWQIEKQE 
CPLCRTPNTHSSLVCLYHSDF* 
>G374 (1..1359) 

ATGGACAACAAAAATGATCAGGATATTGATGTTAGATCAGTGGTTGAAGCTGTTTCCGCC 
GATCTTTCCTTTGGTGCTCCCCTCTATGTGGTTGAGAGCATGTGCATGCGCTGCCAAGAA 
AATGGAACAACCAGATTTCTATTGACCTTAATTCCTCACTTCAGAAAGGTCTTAATATCT 
GCATTTGAATGTCCGCATTGCGGGGAAAGGAATAATGAAGTTCAGTTCGCAGGCGAGATT 
CAACCCCGTGGATGCTGTTACAATCTAGAGGTTCTAGCTGGTGATGTGAAGATATTTGAC 
CGGCAAGTTGTGAAATCTGAATCAGCCACTATTAAGATT 

CCACCAGAGGCCCAACGTGGAAGTTTGTCTACTGTGGAAGGGATATTAGCACGGGCTGCT 
GATGAACTGAGTGCCCTTCAAGAAGAACGCAAGAAAGTTGATCCTAAAACTGCTGAAGCA 
ATAGACCAATTCTTGTCCAAACTGAGAGCTTGTGCT^ 

ATTTTGGATGATCCTGCTGGAAACAGTTTCATTGAGAACCCACATGCTCCATCACCAGAT 

CCCTCTCTAACCATCAAATTCTATGAGCGAACACCAGAGCAACAAGCAACACTTGGATAT 

GTTGCTAACCCATCTCAGGC^GGACAATCAGAAGGAAGCCTTGGCGCACC^ 

TTCCCTTCAACTTGCGGAGCATGTACGGAGCCGTGTGAGACACGGATGTTCAAAATAGAA 

ATCCCGTACTTTCAGGAAGTTATTGTCATGGCATCTACATGTGACAGTTGTGGCTATCG^ 

AATTCTGAGTTGAAGCCTGGTGGTGCAATTCCTGAAAAGGGAAAGAAGATTACTCTCTCT 

GTGAGGAAC^TTAGAGACCTTAGCCGAGATGTTATCAAGTCGGACACTGCAGGAGTGATA 

ATCCCAGAACTTGATCTGGAGCTAGCTGGTGGTACACTTGGTGGAATGGTAACAACAGTT 

GAAGGGTTGGTTACACAGATCAGAGAAAGCCTAGCGAGAGTTCACGGATTCACTTTTGGT 

GATAGTATGGAAGAGAGTAAGTTGAACAAATGGAGAGAATTTGGAGCCAGGCTCACTAAG 

CTCCTT^GCTTTGAACAGCCGTGGACATTGATTCTTGATGATGAATTAGCAAATTCCTT^ 

ATTGCACCAGTAACAGATGATATCAAAGATGACCATCAGCTCACATTTGAAGAGTACGAG 
AGGTCATGGGATCAAAACGAGGAGTTGGGTCTCAACGACATAGATACTTCTTCAGCTGAT
GCTGCTTATGAATCCACAGAGACGACTAAATTACCTTAA 

>G374 Amino Acid Sequence (domain in AA coordinates: 35-67, 245-277)
MDNKOTQDIDTOSWEAVSADLSFGAPLYVVE^ 

AFECPHCGERl^EVQFAGEIQPRGCCyNIjEVLAGDVKIFDRQVVKSESATIKIPELDFEI 
PPEAQRGSLSTVEGILARAADELSALQEERKKTOPKTAE 

ILDDPAGNSFIENPHAPSPDPSLTIKFYERTPEQQATLGYVANPSQAGQSEGSLGAPVMT 

FPSTCGACTEPCETRMFKIEIPYFQEVIVMASTODSCGYRNSELKPGGAIPEKGKiaCTLS 

VRNITDLSRDVIKSDTAGVI I PEIiDLELAGGTLGGMVTTVEGLVTQIRESIjARVHGFTFG 

DSMEESKLNKWREFGARIiTKLLSFEQPWTLIIiDDELANSFIAPVT^ 

RSWDQNEELGIiNDIDTSSADAAYESTETTKLP* 

>G877 (397..2460)

CAAAGATTAGACTAATCCGACTGTGTTTTTAATCAAT 

AGTTGTAAAGTTTTGAl'l U u lU u rr r i'CTGGGTTTTTTCTGTGAGAC C CAGAAGAAGAACAG 
AGAGAGGAAGAAGGAGAAGAAAAAAATATCH'CTTTCTCTCCGGCTTTCAACAAAATCTCT 
CTTTTTTCCTTCATCAGTGTTAAATTCGGATCCGGGTCGGGTGGGTTTTCGGTTTTTGGT 
GTTCGGATCAGAGCACAGTTGGATGTTAGCGACGGAACTGAGGATTTCAGTTTGCGGCTG 

TTTGATCAGAGATTC^GCCAAATTCTTGGATACTAAATGGCTGGTTTTGATGAAAATGOT 
GCTGTGATGGGAGAATGGGTGCCTCGTAGTCCTAGTCCCGGGACACTTTTCTCCTC?TGCT 
ATTGGAGAAGAGAAGAGCTCGAAACGTGTTCTTGAAAGAGAGTTATCTTTGAATCATGGT 
CAAGTTATTGGTTTAGAAGAAGACACTAGTAGTAATCATAAC^ 

AATGTTTTTCGAGGTGGTCTCAGTGAAAGAATTGCTGCAAGAGCTGGATTTAATGCTCCA 

AGGTTGAACACTGAGAATATCCGCACCAACACCGACTTTTCCATTGACTCTAACCTTCGA 

TCTCCTTGCTTAACC^TCTCTTCTCCTGGCCTTAGCCCTGCAACACTCTTGGAATCTCCT 

GTTTTCCTTTCTAACCCATTGGCTCAACCTTCTCCAACTACCGGGAAATTT^ 

CCTGGTGTTAATGGTAATGCATTGTCTTCTGAGAAAGCGA2\AGACGAGTTCTTTGATGAT 

ATTGGAGGATCATTCAGCTTCCATCCTGTTT 

ACAACAGAGATGATGTCAGTTGATTATGGTAACT^ 

TCCGCAGAAGAAGTAAAACCTGGCTCTGAAAACATAGAAAGCTCCAATCTTTATGGGATT 

GAAACTGACAATCAAAACGGGCAGAACAAGACATCTGATGTCACTACAAAC^CCAGTCTT 

GAAACCGTGGATCATCAAGAGGAAGAAGAAGAGCAAAGACGCGGTGATTCGATGGCTGGT 

GGTGCGCCTGCAGAGGATGGATATAACTGGAGGAAATACGGACAAAAGTTGGTCAAAGGA 

AGTGAGTATCCGCGAAGCTATTACAAGTGCACAAACCCGAATTGTCAGGTGAAGAAGAAA 

GTTGAGAGATC^GGGAAGGTCAC^TCACAGAGATTATATACAAAGGAGCTCATA^ 

CTTAAACCTCCACCTAATCGCCGCTCAGGGATGCAAGTAGATGGAACTGAACAAGTTGAA 

CAAGAACAACAACAGAGAGATTCTGCTGCAACGTGGGTT^ 

CAAGGTGGAAGCAATGAGAACAATGTCGAAGAGGGATCTACGAGATTCGAGTATGGAAAC 
CAATCTGGATCAATTCAAGCTCAAACCGGAGGTCAATACGAGTCAGGTGATCCTGTGGTT 
GTGGTTGATGCTTCTTCAACATTCTCTAATGATGAAGATGAAGATGATCGAGGGACACAT 
GGAAGTGTTTCTTTGGGTTACGATGGAGGAGGAGGAGGTGGGGGAGGAGAAGGAGATGAA 
TCAGAGTCGAAAAGAAGGAAACTAGAAGCTTTTGCAGCAGAGATGAGTGGATCAACAAGA 
GCCATACGTGAGCCAAGAGTTGTTGTGCAGACAACGAGTGATGTTGACATTCTTGATGAT 
GGTTATCGCTGGCGAAAATATGGTCAGAAAGTTGTCAAAGGCAATCCAAATCCT^AGGAGT 
TATTACAAATGCACAGCTCCAGGATGTACAGTGAGGAAA(^ 

GATCTCAAATCCGTTATAACAACTTACGAAGGCAAACATAACCATGACGTCCCCGCTGCA 

CGCAACAGCAGCCACGGAGGCGGTGGTGATAGTGGTAACGGTAACAGCGGCGGTTCAGCC 

GCAGTTTCTCACCAyTACCACAACGGTCATC^CTC^GAGCCGCCACGTGGGAGATTCGAC 

AGACAAGTCACAACTAACAATCAGTCTCCTTTTAGCCGTCCCTTTAGCTTTCAGCCACAT 

TTGGGTCCTCCTTCTGGTTTCTCCTTCGGTTTAGGACAAACCGGTTTGGTTAATCTTTCA 

ATGCCTGGTTTAGCGTATGGTCAAGGGAAAATGCCGGGTTTGCCTCACCCGTATATGACA 

CAACCGGTTGGGATGAGTGAAGCAATGATGCAGAGAGGGATGGAACCAAAGGTTGAACCG 

GTTTCAGATTCAGGACAATCGGTATATAACCAGATCATGAGTAGATTACCTCAGATTTGA 

AATTTACTCTTCTTCTTCTTCTTCTGCATTTGGTCACTCCTTATAATAACTTTTAATTTC 

TGCTTCTTCTTCTTCTTTCATTTATTGGTTTCAAACTTTGGGGAAGGTAA 

ATTGTTAAAAAAAAAAAAAAAAA 

>G877 Amino Acid Sequence (domain in AA coordinates: 272-328, 487-603) 






MAGFDENVAVMGEWVPRS PS PGTLFSSAI GEEKS SKRVLERELSLNHGQVIGIjEEDTSSN 
HNKDSSQSNVFRGGLSERIAARAGFNAPRIiNTENIRTNTD 

PATLLESPVFLSNPLAQPSPTTGKFPFLPGVNGNALSSEKAKDEFFDDIGASFSFHPVSR 
SSSSFFQGTTEMMSVDYGNYNNRSSSHQSAEEVKPGSENIESSNIiYGIETDNQNGQNKTS 
DVTTNTSLETVDHQEEEEEQRRGDSMAGGAPAEDGYNWRKYGQKLVKGSEY^ 
PNCQVKKXVERSREGHITEIIYKGAHNHIjKPPPNRRSGMQVD 

VSCNNTQQQGGSNENNVEEGSTRFEYGNQSGSIQAQTGGQYESGDPVVVVDASSTFSNDE 
DEDDRGTHGSVSLGYDGGGGGGGGEGDESESKRRKLEAFAAEMSGSTRAIREPRVWQT^ 
SDVD I LDDGYRWRKYGQKVVKGNPNPRS YYKCTAPGCTVRICHVERASHDIjKS VITTYEGK 
HNHDVPAARNS SHGGGGDS GNGNSGGS AAVSHHYHNGHHSEPPRGRFDRQVTTNNQS PFS 
RPFSFQPHLGPPSGFSFGLGQTGLVNLSMPGIiAYGQGKMPGLPHPYMTQPVGMSEAMMQR 
GMEPKVEPVSDSGQSVYNQIMSRIiPQI * 
>G1000 (1..954) 

ATGGGAAGACCTCCTTGTTGTGACAAGTCCAATGTCAAGAAAGGTCTCTGGACCGAGGAA 
GAAGACGCTAAGATCCTTGCTTATGTTGCTATCCATGGTGTAGGAAACTGGAGCTTGATC 
CCCAAAAAAGCAGGTCTGAATCGATGTGGAAAGAGCTGTAGACTAAGATGGACTAATTAC 
TTAAGACCTGACCTTAAACATGACAGCTTCTCTACCCAAGAAGAAGAGCTTATCATTGAG 
TGTCATAGAGCCATTGGCAGCAGGTGGTCTTCGATTGC^ 

GATAATGATGTGAAGAATCACTGGAACACAAAGCTGAAGAAGAAGCTGATGAAAATGGGG 
ATAGACCCGGTGACTCATAAACCGGTTTCTGAACTCCTT^ 
GGCGATGGAAATGCATCCTTGAAZU\CAGAACCA^ 
AACTCAGCTTGGGAAATGATGAGAAACA.CAACAACAAA 

TCTCCAATGATGTTTACTAATTCCTCTGAGTACCAAACTACTCCATTTCATTTCTATAGC

CATCCAAATCATCTGCTCAATGGAACCACATCTTCATGCTCTTCCTG^ 

AGTATCACTCAGCGAAACCAAGTACCTCAAACACCGG 

TTCCTTCTCTCGGACCCGGTTCCTCAAGTAGTGGGATCCTCAGCTACTAGCGACCTCACT 
TTTACGCAGAACGAACATCATTTCAACATCGAAGCCGAATACATCTCTCAAAACATCGAT 
TCAAAGGCCTCGGGAACATGTCATTCCGCGAGTTCCTTCGTTGACGAAATACTAGATAAA 
GACCAAGAGATGTTGTC^CAGTTTCCTCAACTCTTGAATGATTTCGATTATTAG 
>G1000 Amino Acid Sequence (domain in AA coordinates: 14-117) 
MGRPPCCDKSNVKKGLWTEEEDAKILAYVAIHGVGNWSLIPKKAGLNRCGKSCRLRWTNY
LRPDLKHDSFSTQEEELIIECHRAIGSRWSSIARKLPGRTDNDVKNHWNTKLKKKLMKMG
IDPVTHKPVSQLLAEFRNISGHGNASFKTEPSl>mSIL^ 

SPMMFTNSSEYQTTPFHFYSHPNHLLNGTTSSCSSSSSSTSITQPNQVPQTPVTNFYWSD
FLLSDPVPQVVGSSATSDLTFTQNEHHFNIEAEYISQNIDSKASGTCHSASSFVDEILDK
DQEMLSQFPQLLNDFDY* 
>G1067 (436..1371)

TCTCAAGCTTCTCTCTCCTTTTTTTCCCATAGCACATCAGAATCGCTAAATACGACTCCT 

ATGGAAAGAAGAAGCTACTTCTTTCTCTTGCCCTAATTAATCTACCTAACTAGGGTTTCC 

TCTTACCTTTCATGAGAGAGATCATTTAAGATAAGTCACCTTTTTTATATCTTTTGCTTC 

GTCTTTAATTTAGTTCTGTTCTTGGTCTGTTTCTATATTTTGTCGGCTTGCGTAACCGAT 

CAGACCTTAATGCTTTAGCTATTGTTTCCTCAAAATCATGAGTTTTGACTTCTCGATCTG 

AGTTTTCTTTTTCTCTCTTTACGCTCTTCTTCACCTAGCTACCAATATATGAACGAGCAG 

GATCAAGAATCGAGAAATTGATTTGAGCTGGCGAATAAGCAGTGGTGGGATAGGGAATTA 

GTAGATGCGGCGGCGATGGAAGGCGGTTACGAGCAAGGCGGTGGAGCTTCTAGATACTTC 

CATAACCTCTTTAGACCGGAGATTCACCACCAACAGCTTCAACCGGAGGGCGGGATCAAT 

CTTATCGACCAGCATCATCATCAGCACGAGCAACAT 

GATTCAAGAGAATCIGACCATTCAAACAAAGATCATCATG 

GACCCGAATACATCAAGCTCAGCACCGGGAAAACGTCCACGTGGACGTCCACCAGGATCT 
AAGAACAAAGCCAAGCCACCGATCATAGTAACTCGTGATAGCCCCAACGCGCTTAGATCT 
CACGTTCTTGAAGTATCTCCTGGAGCTGACATAGTTGAGAGTGTTTCCACGTACGCTAGG 
AGGAGAGGGAGAGGCGTCTCCGTTTTAGGAGGAAACGGCACCGTATCTAACGTCACTCTC 
CGTCAGCCAGTCACTCCTGGAAATGGCGGTGGTGTGTCCGGAGGAGGAGGAGTTGTGACT 
TTACATGGAAGGTTTGAGATTCTTTCGCTAACGGGGACTGTTTTGCCACCTCCTGCACCG 
CCTGGTGCCGGTGGTTTGTCTATATTTTTAGCCGGAGGGCAAGGTCAGGTGGTCGGAGGA 
AGCGTTGTGGCTCCCCTTATTGGATCAGCTCCGGTTATACTAATGGCGGCTTCGTTCTCA 
AATGCGGTTTTCGAGAGACTACCGATTGAGGAGGAGGAAGAAGAAGGTGGTGGTGGCGGA 
GGAGGAGGAGGAGGAGGGCCACCGCAGATGCAACAAGCTCCATCAGCATCTCCGCCGTCT
GGAGTGACCGGTCAGGGACAGTTAGGAGGTAATGTGGGTGGTTATGGGTTTTCTGGTGAT 
CCTCATTTGCTTGGATGGGGAGCTGGAACACCTTCT^GACCACCTTTTTAATTGAATTTT 
AATGTCCGGAAATTTATGTGTTTTTATCATCTTGAGGAGTCGTCTTTCCTTTGGGATATT 
TGGTGTTTAATGTTTAGTTGATATGCATATTTT 

>G1067 Amino Acid Sequence (domain in AA coordinates: 86-93) 
MEGGYEQGGGASRYFHNLFRPEIHHQQLQPQGGINIilDQHHHQHQQHQQQQQPSDDSRES 
DHSNKDHHQQGRPDSDPOTSSSAPGKRPRGRPPGSKNKAK^ 

S PGAD I VESVSTYARRRGRGVS VLGGNGTVSNVTIiRQPVTPGNGGGVSGGGGVVTIiHGRF 
EILSLTGTVIiPPPAPPGAGGLSIFIAGGQGQWGGSVVAPL^ 

RiPIEEEEEEGGGGGGGGGGGPPQMQQAPSASPPSGVTGQGQLGGNVGGYGFSGDPHLLG 

WGAGTPSRPPF* 
>G1075 (19..876)

TTTGTGTTTGGTGCTGGCATGGCTGGTCTCGATCTAGGCACAACTTCTC 
AACGTCGATGGTGGCGGCGGCGGACAGTTCACCACCGACAACCACCACGAAGATGACGGT 

GGCGCTGGAGGAAACCACCATCATCACCATCATAATQ 

TTAATAGCTTCTAATGATAACTCTGGACTAGGCGGCGGTGGAGGAGGAGGGAGCGGTGAC 

CTCGTCATGCGTCGGCCACGTGGCCGTCCAGCTGGATCGAAGAACZAAACCGAAGCCGCCG 

GTGATTGTCACGCGCGAGAGCGCAAACACrCTTAGGGCTC^CATTCTTGAAGTTGGAAGT 

GGCTGCGACGTTTTCGAATGTATCTCCACTTACGCTCGTCGGAGACAGCGCGGGATTTGC 

GTTTTATCCGGGACGGGAACCGTCACTAACGTCAGCATCCGTCAGCCTACGGCGGCCGGA 

GCTGTTGTGACTCTGCGGGGTACTTTTGAGATTCTTTCCCTCTCCGGATCTTTTCTTCCG 

CCACCTGCTCCTCCAGGGGCGACTAGCTTGACGATATTCCTCGCTGGAGCTCAAGGACAG 

GTCGTCGGAGGTAACGTAGTTGGTGAGTTAATGGCGGCGGGGCCGGTAATGGTCATGGCA 

GCGTCTTTTACAAACGTGGCTTACGAAAGGTTGCCTTTGGACGAGCATGAGGAGCACTTG 

CAAAGTGGCGGCGGCGGAGGTGGAGGGAATATGTACTCGGAAGCCACTGGCGGTGGCGGA 

GGGTTGCCTTTCTTTAATTTGCCGATGAGTATGCCTCAGATTGGAGTTGAAAGTTGGCAG 

GGGAATCACGCCGGCGCCGGTAGGGCTCCGTTTTAGCAATTTAAGAAACTTTAATTGTTT 

TTTCCACTTTTTTGTTTTTCTCCGAATTTTATGAAATTATGATTTAAGAAAAAAAACGAT 

ATTGTTCATGTATTGACCCTCTTACTGCATGGTTTCTTCTATTGGGTTAATTGGCTAGCT 

CATAAGAATTGTTTAATTTGGTTATT^ 

AAAT 

>G1075 Amino Acid Sequence (domain in AA coordinates: 78-85) 
MAGLDLGTTSRYVHNVDGGGGGQFTTDiRHHED^ 

NSGIiGGGGGGGSGDLVMRRPRGRPAGSKNKPKPPVIVTRESANTLRAHILEVGSGOT 
CISTYARRRQRGICVIjSGTGTVTNVSIRQPTAAGAVVTLRGTFEILSIjSGSFIjPPPAPPG 
ATS LTI FIjAGAQGQVVGGNVVGELMAAGPVMVMAAS FTNVAYERLPLDEHEEHIiQSGGGG 
GGGNMYSEATGGGGGLPFFNLPMSMPQIGVESWQGNHAGAGRAPF* 
>G1266 (62..718)

C^TCCACTAACGATCCCTAACCGAAAAC^GAGTAGTCAAGAAACAGAGTATTTTTTCTA 
CATGGATCCTTTTTTAATTCAGTCCCCATTCTCCGGCTTCTCACCGGAATATTCTATCGG
ATCTTCTCCAGATTCTTTCTC^TCCTCTTCTTCTAACAATTACTCTCTTCCCTTCAACGA 
GAACGACTCAGAGGAAATGTTTCTCTACGGTCTAATCGAGCAGTCCACGCAACAAACCTA 
TATTGACTCGGATAGTCAAGACCTTCCGATCAAATCCGTAAGCTCAAGAAAGTCAGAGAA 
GTCTTACAGAGGCGTAAGACGACGGCCATGGGGGAAATTCGCGGCGGAGATAAGAGATTC 
GACTAGAAACGGTATTAGGGTTTGGCTCGGGACGTTCGAAAGCGCGGAAGAGGCGGCTTT 
AGCCTACGATCAAGCTGCTTTCTCGATGAGAGGGTCCTCGGCGATTCTCAATTTTTCGGC 
GGAGAGAGTTCAAGAGTCGCTTTCGGAGATTAAATATACCTACGAGGATGGTTGTTCTCC 
GGTTGTGGCGTTGAAGAGGAAACACTCGATGAGACGGAGAATGACCAATAAGAAGACGAA 
AGATAGTGACTTTGATCACCGCTCCGTGAAGTTAGATAATGTAGTTGTCTTTGAGGATTT 
GGGAGAACAGTACCTTGAGGAGCTTTTGGGGTCTTCTGAAAATAGTGGGACTTGGTGAAA 
GATTAGGATTTGTATTAGGGACCTTAAGTTTGAAGTGGTTGATTAA"l"'l"l"±AACCCTAATA 
TGTTTTTTGTTTGCTTAAATATTTGATTCTATTGAGAAACATCGAAAACAGTTTGTATGT 

ACTTTTGTGATACTTGGCG 

>G1266 Amino Acid Sequence (domain in AA coordinates: 79-147) 

MDPFLIQSPFSGFSPEYSIGSSPDSFSSSSSNNYSLPFNENDSEEMFLYGLIEQSTQQTY

IDSDSQDLPIKSVSSRKSEKSYRGVRRRPWGKFAAEIRDSTRNGIRVWLGTFESAEEAAL
AYDQAAFSMRGSSAILNFSAERVQESLSEIKYTYED^^
DSDFDHRSVKLDNVVVFEDLGEQYLEELLGSSENSGTW*
>G1311 (41..757)

AAGTATAATAACACAAAGAAACAGAGTAAAAGAAAGAAAAATGGATTTTAAGAAGGAAGA 

AACACTTCGTAGAGGGCCATGGCTCGAAGAAGAAGACGAACGGCTAGTGAAGGTCATTAG 

TCTTTTGGGAGAACGTCGTTGGGATTCTTTAGCAATAGTTTCCGGTTTO 

TAAGAGTTGCAGGCTAAGGTGGATGAACTATCTGAATCCGACTCTGAAGCGTGGACCGAT 

GAGTCAAGAAGAAGAGAGAATCATCTTTCAGCTCCATGCTCTATGGGGTAACAAGTGGTC 

GAAGATTGCGAGAAGATTACCCGGTAGGACTGATAACGAGATAAAGAACTATTGGAGAAC 

TCATTATAGAAAGAAACAGGAAGCTCAAAACTATGGAAAGCrCTTTGAGTGGAGAGG 

TACAGGAGAAGAATTGTTGCACAAGTATAAGGAAACAGAGATCACTAGGACAAAGACGAC

GTCTCAAGAACATGGTTTTGTTGAAGTTGTGAGCATGGAAAGTGGTAAAGAAGCCAACGG 

TGGTGTTGGTGGAAGAGAAAGCTTCGGTGTTATGAAATCACCGTATGAAAATCGGATTTC 

GGATTGGATATCAGAGATTTCTACrGACCAGAGTGAAGCAAATCTTTCAGAAGATCACAG 

CAGCAATAGCTGCAGTGAGAACAATATTAACATTGGTACTTGGTGGTTTCAAGAGACTAG 

GGACTTTGAGGAGTTTTCATGTTCTCTATGGTCATAATTCTAAAGTTGGTTTATTTACTT 

TTTAAAAAAAAAAAAAAAAA 

>G1311 Amino Acid Sequence (domain in AA coordinates: 11-112) 
^FKKEETLRRGPWLEEEDERLVKVIS 

TIiKRGPMSQEEERI I FQLHALWGNKWSKIARRLPGRTDNEIKNYWRTHYRKKQEAQNYGK 
LFEWRGNTGEELLHKYKETEITRTK^ 

PYEITOISDWISEISTDQSEAl^SKDHSSNSCSENNINIGTWWFQETRDFEEFSCSLWS* 
>G1321 (72..803)

GTTCTTGTATTGGTTTGGATCGGTATACTTAGTTGATTACGTAATTAAATAGATCGGCGT 
GAAGAAGAAAAATGATCATGTGCAGCCGAGGCCATTGGAGACCAGCTGAAGACGAGAAGC 
TCAAGGATCTTGTCGAACAATACGGTCCTCACAATTGGAACGCCATTGCTCTCAAGCTTC 
CTGGTCGCTCTGGTAAGAGTTGTAGATTGAGAT^ 

ACCGAAACCCTTTCACGGAAGAAGAAGAAGAAAGACTTTTAGCGGCTCATCGGATCCATG 

GGAACAGATGGTCCATCATCGCAAGGCTTTTCCCTGGAAGAACTGATAACGCCGTCAAGA 

ACCATTGGCACGTCATCATGGCTCGTCGCACACGCCAAACCTCTAAGCCTCGTCTTCTTC 

CCTCGACGACTTCGTCTTCTTCTTTAATGGCGAGTGAACAAATCATGATGAGTTCTGGTG 

GTTATAATCATAATTATAGTTCCGATGATCGGAAGAAAATATTTCCAGC^GACTTTATAA 

ATTTCCCTTACAAATTCTCTCATATCAATCATCTTCACTTCCTAAAGGAGTTTTT^ 

GAAAGATCGqTTTAAGT<^CAAAGC^^TCAGAGTAAGAAGCOTATGGAGTTCTA(^TT 

TTCTACAAGTAAACACAGATTCAAACAAGA^^ 

GCAAACGCAGTGACTCGGACACCAAAC3VTG 

CCGTTGGAAACTCTGCCTCCTAGGATTAGTTTTTTTGCAGTAACTCCTAAATTTCTAGAT 
TAACTATTTAGTCCGTATACGTACGAGATTATCTAGGTCGTTAGCiATGTATGCTTGATGT 
GTATAATCACTAACTAGTGAGCTATTACCTGCGAAAATTGTAAGAAAAATACATAATGTT 
GATGTATCACACATTCTCAATGTCTGTAAAATTTCCATCGAGTTGTTAACTATCAAAGTT 

ATCCGTTTGAAAAAAAAAAAA 

>G1321 Amino Acid Sequence (domain in AA coordinates: 4-106) 

MIMCSRGHWRPAEDEKLKDLVEQYGPHNWNAIALK^ 

FTEEEEERLLAAHRIHGNRWSIIARLFPGRTDNAVKNHW^ 

SSSSLMASEQIMMSSGGYNHI^SSDDRKOFPADFII^PYKFSHINHIjHFIjKEFFPGKIA 
LSHKANQSKKPMEFYNFLQVlTroSNKSEIIDQDSGQSKRSDSDTKHESHVPFFDFIjSVGN 

SAS* 

>G1326 (32..784)

CGACGGTACGGTGGAGATAGAGATAGCATCCATGGAGATGTCTAGAGGAAGCAACAGTTT 

TGACAATAAGAAGCCTAGTTGCCAAAGAGGTCACTGGAGACCTGTTGAAGATGACAATCT 

CCGGCAACTCGTTGAACAATACGGTCCCAAGAACTGGAATTTTATTGCTCAACATCTCTA 

TGGAAGATCAGGGAAAAGCTGTAGATTAAGATGGTACAACCAACTTGATCCAAACATCAC 

C^GAAACCCTTCACCGAGGAGGAAGAAGAGAGACTGCTTAAAGCTCATCGGATCCAAGG 

GAATCGTTGGGCCTCCATAGCCCGACTGTTCCCCGGGAGGACCGACAACGCTGTCAAAAA 

CCATTTTCATGTCATCATGGCTAGACG 

TACGTTC^C(^^CTTGGCATACTC^ 

TAGATCCCATTTCGGGCTATGGAGGTATCGAAAGGATAAGAGTTGCGGTCTCTGGCCTTA 
CTCTTTTGTTTCACCACCTACGAATGGTCAATTTGGATCTTCATCTGTCTCTAACGTACA
C(^CGAAATTTATCTTGAGAGGAGAAAGTCG7^GAGTTGGTGGATCCTCAGAATTACAC 
ATTTCATGCAGCCACACCAGATCATAAGATGACTTCAAATGAAGATGGACCATCCATGGG
AGATGATGGTGAGAAGAACGATGTTACTTTCATTGATTTTCITGGTGTTC 
TTAGGTTATAACATCACAAGTCAAAGCTTTTAAGGGTTTCTATCATTAGGGTTAGGCATC 

ATTTTCAGCCTTTTGCTTCCTTAAACTCTCATATGGATCT 

>G1326 Amino Acid Sequence (domain in AA coordinates: 18-121) 
MEMSRGSNS FDNKKPSCQRGHWRPVEDDNIjRQIjVEQYGPKNWNF I AQHL YGRSGKS CRLR 
WYNQLDPNITKKPFTEEEEERIjLKAHRIQGNRWAS iarlfpgrtdnavt<ithfhvimarrk 
RENFSSTATSTFNQTWHTVIjSPSSSLTRIjiroSHFGLV^YRKDKSCGLWPYSFVSPPTNGQ 
FGSSSVSNVHHEIYLERRKSKEIiVDPQNYTFHAATPDHKMTSNEDGPSMGDD 

IDFLGVGLAS* 
>G1367 (128..1567)

TCCTTCCACAAAACTTTTTTAATTTTATCTGAAAAATTAA^ 

AAAACTAAAAATCAAAAATCTC^TCACCTTCCTTGCTCTGTATTTTTTCTCTCTCACTAA 

ATCCTCCATGGATCCTTCTCTCTCTGC^ 

GTTC^CATCTTTCCCTCCTTTCACCM^^ 

CTTCZACCGGACCCACCGCCGTCGCGCCGCCAAACAACATCCATCTCTATCAAGCAGCTCC 
TCCGCAGC^GCC^CAAA<^TCTCC^GTTCCT^ 

CTCTGACATGATTTGCA.CGGCGATTGCAGCGTTAAACGAACCAGATGGGTCAAGCAAGCA 

AGCTATTTCGAGGTACATAGAGAGAATTTACACTGGGATTCCTACTGCTCATGGAGCTTT 

GTTGACACACCATCTCAAGACTTTGAAGACCAGTGGGATTCTTGTCATGGTTAAGAAATC 

TTACAAGCTTGCTTCTACTCCTCOTCCTCCTCCTCCTACTAGTGTAGCTCCTAGTCTTGA 

ACCTCCCAGATCTGATTTCATAGTCAACGAGAACCAACOTTTACCTGATCCGGT^ 

TTCTTCTACTCCTCAGACTATTAAACGTGGTCGTGGTCGACCTCCAAAAGCTAAACCAGA 

TGTTGTTCAACCTCAACCTCTGACTAATGGAAAACTCACCTGGGAACAGAGTGAATTACC 

TGTCTCTCGACCAGAGGAGATACAGATACAGCCGCCACAGTTACCGTTACAGCCACAGCA 

GCCGGTTAAGAGACCGCCGGGTCGTCCTAGAAAAGATGGAACTTCGCCGACGGTGAAGCC 

AGCTGCTTCTGTTTCCGGTGGTGTGGAGACTGTGAAACGAAGAGGTAGACCTCCGAGTGG 

AAGAGCTGCTGGGAGGGAGAGAAAGCCTATAGTAGTCTC^GCTCC^GCrTCAGTGTTCCC 

GTATGTTGCTAATGGTGGTGTTAGACGCCGAGGGAGACCAAAGAGAGTTGACGCTGGTGG 

TGCTTCCTCTGTTGCTCCACCACCACCACCACCAACTAACGTAGAGAGTGGAGGAGAGGA 

GGTTGCAGTCAAGAAACGAGGAAGAGGACGGCCTCCTAAGATTGGAGGTGTTATCAGGAA 

GCCTATGAAGCCGATGAGAAGCTTTGCTCGTACTGGAAAACCCGTAGGAAGACCCAGAAA 

GAATGCGGTGTCAGTGGGAGCTTCTGGACGACAAGATGGTGACTATGGAGAACTGAAGAA 

GAAGTTTGAGTTGTTTCAAGCGAGAGC^AAGGATATTGTAATTGTGTTGAAATCCGAGAT 

AGGAGGAAGTGGAAATCAAGCAGTGGTTCAAGCCATACAGGACCTGGAAGGGATAGCAGA 

GACAAGAAACGAGCCAAAGCACATGGAAGAAGTGCAGCTGCCAGACGAG 

AACCGAACCAGAAGC7VGAGGGTCAAGGACAGACAGAAGCAGAGGCAATGCAAGAAG 

GTTCTAAAGATAAAGCCTTGACATAAAAAGCTAGCAAGTGGTGGGTTTACTTGTTGTGTG 

TTACATGAAATTTTTAATCTTAT^ 

GATGAACTGATGATGATGATTGTGTCTCTAACCAAACAACAAGGAGAGGTAGGGTAATGT 
CTGTAAAGTGAATTAGGATGTTACCATTGTTCATGCTTCCCATCTCTCTCCATCGTCCAT 
ATCTGTGTAGGCAGCTTTGTTCTTTGTTCCCTCGTGTTTTTTTTAGACTGTTGTGTCTCT 
TATTCTATTTTGTCTCCTTAGGCTTTTTAGGAGTTGTTGTTGATGTTTATCAAAAACGCT 
TATGTAATTTTTATGACCACTTCTACTTTTTATGATGGTTTCTT 

>G1367 Amino Acid Sequence (domain in AA coordinates: 179-201, 262-285, 298-319, 335-357)

MDPSLSATNDPHHPPPPQFTSFPPFTNTNPFASPNHPFFTGPTAVAPPNNIHLYQAAPPQ 
QPQTSPVPPHPSISHPPYSDMICTAIAAIiNEPDGSSKQAISRYIBRlYTGIPTAHGALLT 
HHLKTLKTSGILVMVKKSYKIiASTPPPPPPTSVAPSLEPPRSDFIW 

TPQTIKRGRGRPPKAKPDWQPQPLTNGKIiTWEQSELPVSRPEEIQIQPPQLPLQPQQPV 
KRPPGRPRKDGTSPTVKPAASVSGGVETVKRRGRPPSGRAAGRERKPIVVSAPASVFPYV 
ANGGVRRRGRPKRVDAGGASSVAPPPPPPTNVESGGEEVAVKKRGRGRPPKIGGVIRKPM 
KPMRS FARTGKPVGRPRKNAVS VGASGRQDGD YGELKKKFEIjFQARAKD I VI VLKS E I GG 
SGNQAVVQAIQDLEGIAETTI^PKHMEEVQLPDEEHIiETEPEAEGQGQTEAEAMQEAIjF* 

>G1386 (89..673)
AATTTTATTTCCTTCTCTGAAATC

TCCCTTTTAAAAGAAAATATCCCAATTAATGGAACGTGACGACTGCCGGAGATTTCIAGGA 

CTCGCCGGCGCAGACGACGGAGAGAAGAGTGAAATATAAACGAAAGAAGAAAAGAGCCAA 

AGATGATGATGATGAGAAAGTTGTTTCGAAGCATCCAAATTTTCGAGGTGTCAGAATGAG 

ACAATGGGGAAAATGGGTGTCCGAAATCAGAGAGCCAAAAAAGAAATCAAGAATCTGGCT 

CGGTACTTTCTCCACGGCGGAGATGGCGGCGCGTGCTC^CGACGTGGCAGCTTTAGCCAT 

CAAAGGCGGTTCTGCACATCTCAACTTCCCGGAGCTCGCTTATCACCTCCCTAGACCAGC 

TAGTGCCGACCCTAAAGACATCCAAGCTGCCGCCGCCGCAGCTGCAGCCGCTGTGGCCAT 

TGACATGGATGTAGAGACGTCTTCGCCGTCGCCATCTCCCACAGTTACGGAAACGTCATC 

TCCGGCTATGATAGCACTCTCCGACGACGCGTTCTCCGATCTTCCTGATCTCTTGCTCAA 

CGTGAACCATAACATCGATGGCTTCTGGGACTCTTTTCCCTATGAAGAACCCTTCCTCTC 

TCAAAGTTACTAGAAACTCATU^CTATGTCGTTTTTGTATGTATTTTTGTCATGTGACCA 

TTTTTTGACGTCGAAAATCACCCGGATAATCCAAATTGTATGATTTATTAATGGTTGATG 

ATTTTCTTTGTGTGGAACAATGTGTATGATACGTAATC^^ 

AAAAA 

>G1386 Amino Acid Sequence (domain in AA coordinates: TBD) 
MEI^DCRI^QDSPAQTTERRVKYKPKKKRAKDDDDEKVVS KHPNFRGVRMRQWGKWVSEI 
REPKKKSRIVn^TFSTAEMAARAHDVAALAI 

AAAAAAAAVAIDMDVETS S PS PS PTVTETS S PAMXALSDDAFSDLPDLLLNVNHNIDGFW 

DSFPYEEPFLSQSY* 

>G1421 (292..1155)

GAAATTTCATCCCTAAATAAGAAAAAAGCATCTCCTTCTTTAGTGTCCTCCTTCACCAAA 
CTCTTGATTCCATAAGCATATATTAAAAAAGCTCTCTGCTTTCTTCAACTTTCCCGGGAA 
AATCTTCTTGTTACAAAGCATCAATCTCT^ 
TTTGCCCTTTACTTTTCCTAACTTTGGTCT^ 
C^C^CATAAGTTAAAACTATTACAAC^^ 
GAGAAGAAAGTTTCTCTCCCAAGAATCTTACGAATC 
GATTCGTCAAGCGACGAAGAAGAAGAAGTTGATTTTGATGC^TTATCT 
* CGTGTTAAGAAGTACGTGAAGGAAGTGGTGCTTGATTCGGTGGTTTCTGATAAAGAGAA 
CCGATGAAGAAGAAGAGAAAGAAGCGCGTTGTTACTGTTCC^VGTGGTTGTTACGACGGCG 
ACGAGGAAGTTTCGTGGAGTGAGGCAAAGACCGTGGGGAAAATGGGCGGCGGAGATTAGA 
GATCCGAGTAGACGTGTTAGGGTTTGGTTAGGTACTTTTGACACGGCGGAGGAAGCTGCC 
ATTGTTTACGATAACGCAGCTATTCAGCTACGTGGTCCTAACGCAGAGCTTAACTTCCCT 
CCTCCTCCGGTGACGGAGAATGTTGAAGAAGCTTCGACGGAGGTGAAAGGAGTTTCGGAT 
TTTATCATTGGCGGTGGAGAATGTCTTCGTTCGCCGGTTTCTGTTCTCGAATCTCCGTTC 
TCCGGCGAGTCTACTGCGGTTAAAGAGGAGTTTGTCGGTGTATCGACGGCGGAGATTGTG 
GTTAAAAAGGAGCCGTCTTTTAACGGTTCAGATTTCTCGGCGCCGTTGTTCTCGGACGAC 
GACGTTTTTGGTTTCTCGACGTCGATGAGTGAAAGTTTCGGCGGCGATTTATTTGGAGAT 
AATCTTTTTGCGGATATGAGTTTTGGATCCGGGTTTGGATTCGGGTCTGGGTCTGGATTC 
TCCAGCTGGCACGTTGAGGACCATTTTCAAGATATTGGGGATTTATTCGGGTCGGATCCT 
GTCTTAACTGTTTAAGAAATAACTGGCCGTTTAACGGCGTTTAGTGAAGTTTTGTTACCG 
GCGACGGCGAGGATTAAAAAAAAACGGCGATTTAl u l l l u rTlGAATGAAGATTTGTTAAATA 
>G1421 Amino Acid Sequence (domain in AA coordinates: 74-151) 
METEKKVS L PR I LR I S VTD P Y ATDS S SDEEE EVD FDALS TKRRRVK3CYVT<E VVLD S WS D 
KEKPMKKKRKKRVVTVPVVVTTATRKFRGVRQRPWGK^ 

EAAIVYDNAAIQLRGPNAELNFPPPPVTENVEEASTEVKGVSDFIIGGGECLRSPVSVLE
SPFSGESTAVKEEFVGVSTAEIVVKKEPSFNGSDFSAPLFSDDDVFGFSTSMSESFGGDL
FGDNLFADMSFGSGFGFGSGSGFSSWHVEDHFQDIGDLFGSDPVLTV* 
>G1453 (39..917)

CGTCGACGCGAAATAAATCCTAGAAAATAACTATC^VATATGATGAAGGTTGATCAAGATT 
ATTCGTGTAGTATACCGCCTGGATTTAGGTTTCATCCGACAGATGAAGAACTTGTCGGAT 
ATTATCTCAAGAAGAAAATCGCCTCCCAGAGGATTGATCTCGACGTTATCAGAGAAATTG 
ATCTTTACAAGATCGAACCATGGGATCTACAAGAGAGATGTAGGATAGGGTACGAGGAGC 
AAACGGAGTGGTATTTCTTCAGCCATAGAGAGAAGAAGT^ 

ACCGAGCCACCGTGGCCGGTTTCTGGAAAGCAACGGGCCGGGACAAGGCGGTTTACCTCA 
ACTCCAAACTTATCGGTATGAGAAAAACGCTTGTCTTTTACCGAGGTCGAGCGCCTAATG 
GCCAAAAGTCCGATTGGATC^TTCACGAATACTACAGCCTCGAGTCACACCAGAACTCTC 
CTCCACAGGAAGAAGGATGGGTAGTGTGTAGAGCATTTAAGAAACGAACGACCATCCCAA
CAAAAAGGAGGCAACTTTGGGATCCGAACTGCTTATTCTACGACGACGCCACTCTCTTGG 
AACCTCTCGACAAGCX3AGCCAGACATAATCCTGATTTTACCGCCACACCGTTCAAGCAAG 
AACTACTCTCCGAGGCCAGTCACGTCCAGGATGGAGATTTCGGATCTATGTACCTTCAAT 
GCATCGATGATGATCAATTCTCCCAGCTTCCTCAGCTCGAGAGCCCCTCTCTTCCGTCGG 
AAATAACTCCCCATAGTACTACTTTTTCTGAGAACAGTAGCCGGAAAGATGACATGAGCT 
CCGAGAAGAGGATCACTGACTGGAGATATCTAGATAAGTTCGTGGCGTCTCAATTTTTGA 
TGAGTGGAGAAGACTAAAAAAGGCTTTCCTATGCATGCATGCACTAGAAACGTCGTCGCA 

TTTTGGATTTACATGCGGCCGCT 

>G1453 Amino Acid Sequence (conserved domain in AA coordinates: 13-160)

MMKVT5QDYSCSIPPGFRFHPTDEELVGYYLKKKIASQRIDLDVIREIDLYKI 

CRIGYEEQTEWYFFSHRDKOTPTGTRT^ 

YRGRAPNGQKSDWIIHEYYSLESHQNSPPQEEGWVVCRAFKKRTTIPTKRRQLWDPNCLF
YDDATLLEPLDKRARHNPDFTATPFKQELLSEASHVQDGDFGSMYLQCIDDDQFSQLPQL
ESPSLPSEITPHSTTFSENSSRKDDMSSEKRITDWRYLDKFVASQFLMSGED* 

>G1560 (120..1340)

ATCCTTTCAATTTCCACTCCTCTCTAATATAATTCACATTTTCCCACTATTGCTGATTCA 

TTTTTTTTTGTGAATTATTTCAAACCCACATAAAAAAATCTTT^ CA 
TGGATCCTTCATTTAGGTTCATTAAAGAGGAGTTTCCTGCTGGATTCAGTGATTCTCCAT 
CACCACCATCTTCTTCTTCATACCTTTATTCATCTTCCATGGCTGAAGCAGCCATAAATG 
ATCCAACAACATTGAGCTATCCACAACCATTAGAAGGTCTCCATGAATCAGGGCCACCTC 
CATTTTTGACAAAGACATATGACTTGGTGGAAGATTCAAGAACCAATC^TGTCGTGTCTT 
GGAGCAAATCCAATAACAGCTTCATTGTCTGGGATCCACAGGCCTTTTCTGTAACTCTCC 

TTCCCAGATTCTTCAAGCACAATAACTTCTCCAGTTTTGTCCGCCAG^ 
GTTTCAGAAAGGTGAATCCGGATCGGTGGGAGTTTGCAAACGAAGGGTTTCTTAGAGGGC 
AAAAGCATCTCCTCAAGAACATAAGGAGAAGAAAAACAAGTAATAATAGTAATCAAATGC 
AACAACCTCAAAGTTCTGAACAACAATCTCTAGACAATT^ 

ACGGTCTAGATGGAGAGATGGACAGCCTAAGGCGAGACAAGCAAGTGTTGATGATGGAGC 
TAGTGAGACTAAGACAGCAACAACAAAGCACCAAAATGTATCT 

AGCTCAAGAAGACCGAGTCAAAACAAAAACAAATGATGAGCTTCCTTGCCCGCGCAATGC 
AGAATCCAGATTTTATTCAGCAGCTAGTAGAGCAGAAGGAAAAGAGGAAAGAGATCGAAG 
AGGCGATCAGCAAGAAGAGACAAAGACCGATCGATCAAGGAAAAAGAAATGTGGAAGATT 
ATGGTGATGAAAGTGGTTATGGGAATGATGTTGGAGCCTCATCCTCAGCATTGATTGGTA 
TGAGTCAGGAATATACATATGGAAACATGTCTGAATTCGAGATGTCGGAGTTGGACAAAC 
TTGCTATGCACATTCAAGGACTTGGAGATAATTCCAGTGCTAGGGAAGAAGTCTTGAATG 
TGGAAAAAGGAAATGATGAGGj^GAAGTAGAAGATCAACAACAAGGGTACCATAAGGAG^ 
ACAATGAGATTTATGGTGAAGGTTTTTGGGAAGATTTGTTAAATGAAGGTCAAAATTTTG 
ATTTTGAAGGAGATCAAGAAAATGTTGATGTGTTAATTGAGCAACTTGGTTATTTGGGTT 
CTAGTTCACACACTAATTAAGAAGAAATTGAAATGATGACTACTTTAAGCATTTGAATGA 
ACTTGTTTCCTATTAGTAATTTGGCTTTGTTTCAATCAAGTGAGTCGTGGACTAACTTGC 

>G1560 Amino Acid Sequence (domain in AA coordinates: 62-151) 
MDPSFRFIKEEFPAGFSDSPSPPSSSSYLYSSSMAEAAINDPTTLSYPQPLEGLHESGPP 

PFLTKTYDIjVEDSRTNHVVSWSKSNNSFIvW)PQAFSvT^ 

GFRKVIJPDRWEFANEGFLRGQKHIjLKNIRRRKTSNNSNQMQQPQSSEQQSLDNFCIEVGR 
YGLDGEMDSLRRDKQVLMMELV^ 

QNPDFIQQIjV^QKEKKKEIEEAISKKRQRPIDQGKRNVEDYGDESGYGNDVAASSSALIG 

MSQEYTYGNMSEFEMSELDKIiAMHIQGLGDNSSAREEVI^ 
NNEIYGEGFWEDLI^EGQNFDFEGDQEITTOVLIQQLGYIiGSSSHTO 

>G1594 (1..984)

ATGGATGGAATGTACAATTTCCATTCGGCCGGTGATTATTCAGATAAGTCGGTTCTGATG 

ATGTCACCGGAGAGTCTCATGTTTCCTTCCGATTACCAAGCTTTGCTATGTTCCTCCGCC
GGTGAAAATCGTGTCTCTGATGTTTTC^

GCTTTGTCGTCGGAGGCGGCTTCGATCGCTCCGGAGATCCGAAGAAATGATGATAACGTT
TCTCTAACTGTC^TCAAAGCTAAAATCGCTTGTCATCCTTCGTATCCTCGCTTACTTCAA
GCTTACATCGATTGCC^AAAGGTCGGAGCACC^CCGGAGATAGCGTGTTTACTAGAGGAG
ATTC^CGGGAGAGTGATGTTTATAAGCAAGAGGTTGTTCCTTCTTCTTGCTTTGGAGCT
GATCCTGAGCTTGATGAATTTATGGAAACGTACTGCGATATATTAGTGAAATACAAATCG
GATCTAGCAAGACCGTTTGACGAGGCAACGTGTTTCTTGAACAAGATTGAGATGCAGCTA
CGGAACCTATGTACTGGTGTCGAGTCTGCCAGGGGAGTTTCTGAGGATGGTGTAATATCA 
TCTGACGAGGAACTGAGTGGAGGTGATCATGAGGTAGCAGAGGATGGGAGACAAAGATGT 
GAAGACCGGGACCTC^UU^GATAGGTTGCTACGCAAATTTGGAAGCCGTATTAGTACTTTA 
AAGCTTGAGTTCTCAAAGAAGAAGAAGAAAGGAAAGTTACCAAGAGAAGCAAGACAAGCT 
CTTCTTGATTGGTGGAATCTCCATTATAAGTGGCCTTACCCTACTGAAGGAGATAAGATA 
GCATTAGCTGATGCAACGGGGTTAGACCAAAAAC 

AGGAAACGTCATTGGAAGCCATCAGAGAATATGCCTTTCGCTATGATGGATGATTCTAGT 
GGATCATTCTTTACCGAGGAATGA 

>G1594 Amino Acid Sequence (conserved domain in AA coordinates: 343-308)
MDGMYNFHSAGDYSDKSVIiMMSPESI^ 

ALSSEAASIAPEIRROTDWSLTVIKAKIACHPSYPRLLQAYIDCQKVGAPPEIACLIiEE 
IQRESDVYKQEVVPSSCFGADPEIiDEFMETYOTILVKYKSDI^ 

RNLCTGVESARGVSEDGVISSDEELSGGDHEVTUBDGRQRCEDRDLKDRIiIjRKFGSRISm 
KLEFSKKKKKGKLPREARQALIjDWWNIiHYKWPYPTEGDK^ 
RKRHWKPSENMPFAMMDDSSGSFFTEE* 
>G1750 (94..1101)

CCCTTTTCCTCTCTTTCTCCAAATCTCTGAAA 

TGACCAGAATCTCTCTGTTTAAAATAATAGGTGATGATGATGGATGAGTTTATGGATCTT 
AGACCAGTGAAGTAC^C^GAGCACAAGACTGTTATC^GAAAGTACACTAAAAAGTCGTCT 
ATGGAGAGGAAGACCAGTGTTCGTGACTCGGCCAGGTTGGTTCGGGTCTCAATGACGGAT 
CGTGACGCCACTGATTCATCAAGCGACGAGGAAGAGTTTCTGTTCCCTCGAAGACGTGTC 
AAGAGATTGATTAACGAGATC^GAGTCGAGCCTAGCAGCTCTTCCACCGGCGACGTCTCT 
GCTTCTCCGACGAAGGACCGGAAAAGAATCAACGTTGATTCTACGGTTCAAAAGCCCTCT 
GTTTCCGGCCAZ^ACCAGAAGAAGTACCGCGGCGTGAGAG^GCGACCATGGGGAAAATGG 
GCGGCGGAGATTCGTGATCCTGAGCAACGCCGGAGAATCTGGCTCGGTACTTTTGCAACG 
GCGGAGGAAGCTGCCATCGTCTACGACAACGCAGCAATCAAACTTCGTGGCCCTGATGCT 
CTTACCAACTTCACCGTACAACCAGAACCAGAACCGGTACAAGAACAAGAACAAGAACCG
GAGAGCAACATGTCGGTTTCGATATCAGAATCAATGGACG^ 

CCGACATCGGTTCTCAACTACCAAACATATGTCTCGGAGGAACCAATCGATAGTCTTATC 
AAACCGGTTAAACAAGAGTTTCTTGAACCAGAAC^^ 

GAAGGTAATACTAATACTAATGATGATTCATTTCC^TTGGACATTACATTTCTCGACAAC 
TATTTCAATGAATCATTACCAGACATCTCCATCT 

CCAAC^GAGAATGATTTCTTCAACGACCTTATGTTATTCGATAGCAACGCAGAAGAATAC 
TACTCCTCCGAGATCAAAGAGATTGGTTCATCGTTGAACGATCTTGATGATTCTTTGATA 
TCCGATCTCTTACTTGTGTGATATTTTTGCCATTAACCAAACACCGGTTTGGTTGC 
>G1750 Amino Acid Sequence (domain in AA coordinates: 107-173) 
MMMDEFMDLRPVKYTEHKTVIRKYTKKSSMERKTSVRDSARLVRVSMTDRDATDSSSDEE
EFLFPRRRVKRLINEIRVBPSSSSTGDVSASPTKDRKRIN^ 

VRQRPWGKWAAE IRDPEQRRRI WLGTF ATAEEAAI VYDNAAIKLRGPDALTNFTV 
PVQEQEQEPESNMSVSISESiynDDSQHLSSPTSVXNYQTYv^^ 

QEPISWHLGEGNTNTNDDSFPLDITFLDNYFNESLPDISIFDQPMSPXQPTENDFFNDLM 

LFDSNAEEYYSSEIKEIGSSFNDLDDSIjISDLLLV* 

>G1947 (70..918)

ACAACTATTCTCTCCTCTCTCTTTTTTTAT^ 
GTTCACAAAATGGATTATAACCTTCC^UVT 

ACGGCTTTCTTGACGAAAACATACAACATAGTGGAGGATTCAAGCACAAACAACATAGTT 

TCATGGAGCAGAGA€AACAACAGCTTCATTG1TTGGGAACCAGAGACTTTTGCCCTAA 

TGCCTCCCTAGATGCTTTAAGCACAATAATTTCTCCAGCTTTGTTAGACAGCTCAATACT 

TATGGGTTTAAGAAGATTGATACAGAGAGATGGGAATTTGCAAATGAGCATTTTCTGA 

GGAGAGAGGCATCTTCTTAAGAACATCAAGAGAAGAAAGACATGATCTCAAA^ 

CAGTCGCTAGAAGGAGAGATCCATGAGCTGCGAAGAGACAGAATGGCTTTAGAAGTAGAA 

CTGGTTAGACTGCGACGAAAACAAGAAAGCGTGAAGACATATCTGCATTTGATGGAAGAG 

AAACTGAAAGTCACAGAAGTAAAGCAAGAAATGATGATGAATTTCTTGCTAAAGAAGATT 

AAGAAACCGAGTTTTTTACAGAGCTTAAGGAAACGTAATCTGCAAGGAATCAAGAATCGA 

GAGCAAAAGCAAGAGGTGATCTCAAGCCATGGTGTTGAGGATAATGGAAAGTTTGTTAAA 

GCTGAGCCAGAAGAGTATGGTGATGACATCGATGATCAATGTGGAGGTGTGTTTGATTAT 
GGTGATGAGCTTCACATAGCTTCAATGGAGCATCAAGGACAAGGGGAGGATGAAATTGAA

ATGGATAGTGAAGGAATTTGGAAGGGTTTCGTGTTGAGTGAGGAGGAGATGTGTGATTTA 

GTGGAACATTTTATATAATAAACTAATGTATTATGAGAGGTTTTTTTTTGTTTTTTTGCT 

TTTTTTTTCCGAGTTTGTCATCAAGCATTGTATACAATTTGGGCCA^ 

CAAAATATTTGGCCTTGGCATTTGTTAACAAATTGACTAATTCGGCCACACCTTCC 

>G1947 Amino Acid Sequence (domain in AA coordinates: 37-120) 

MDYIHjPIPLEGLKETPPTAPLTKTYNIVEDSSTNNIVSWSRDNNSFIVWEPETFALiICLP 

RCFKHEnSTFSSFWQLNTYGFKKIDra 

EGEIHELRRDRMALEVELVRLRRK^ 

S FIjQ S LRKRNLQGI KNREQKQEVI S SHGVEDNGKFVKAEPEEYGDDIDDQCGGVFD YGDE 

LHIASMEHQGQGEDEIEMDSEGIWKGFVXjSEEEMCDIiVEHFI* 

>G2011 (309..1547)

AATGTCGGTTGTACAATTATTTGTCACTAAAGTTTCCAAATTTCTTCTAAACTGATGAAT 
CAATGGAACATGATGACGAAAAAGATAAATCCACGGTGGCGGGAACTGACCCACCCATTT 
CCACCGCCTCTCTATTCCCGAGATTTTTT^ 

TCCTTCCCTAAACCTTTATAAAC(^TTAAACCTCTCATCCTTCTTCTCTTAAACCCCCTA 

ATTATCACACACACCCG^TTTCTCACT 

TATATCAAATGAGCCCAAAAAAAGATGCTGTTTC^ 

TTTCGAGACGATCCGATATACCCGGGTOTCTCTACGTCGACACTGACATGGGTTTCTCTG 

GGTCACCACTTCCCATGCCACTAGACATCTTACAAGGGAATCCAATTCCACCTl" 

CCAAGACTTTTGATTTGGTTGATGACCCGACTCT^GACCCGGTCTVTCTCTTGGGGACTGA 

CCGGAGCTAGCTTCGTAGTTTGGGATCCTCTAGAGTTTGCCAGAATCATACTTCCAAGGA 

ATTTCAAACACAACAATTTCTCCAGCTTCGTCAGACAGCTTAACACTTATGGATTTCGAA 

AGATTGATACTGACAAGTGGGAATTCGCTAACGAGGCTTTCCTTAGAGGCAAGAAGCATC 

TTCTGAAGAACATTCATCGTCGTCGATCACCACAATCC^ 

CTAGCCAAAGCCAAGGGTCACCTACTGAGGTTGGAGGAGAGATTGAGAAGCTGAGGAAAG 

AGCGGCGTGCZATTGATGGAGGAAATGGTTGAGCTTCAGCAGGAAAGCAGAGGCAC^ 

GAC^TGTGGACACTGTAAACCAGAGGCTGAAAGCTGCAGAGCAACGTCAGAAGCAATTGC 

TCTCTTTCTTGGCTAAGTTGTTTGAGAACCGGGGTTTCTTGGAACGCCTGAAGAACTTCA 

AAGGAAAAGAAAAAGGAGGAGCTCTTGGATTGGAAAAGGCGAGAAAGAAGTTCATCAAGC 

ACCACCAGCAGCCTCAAGATTCTCCAACAGGAGGGGAGGTGGTGAAGTATGAAGCTGATG 

ATTGGGAGAGATTGCTAATGTATGACGAAG^GACTGAGAACACCAAGGGTTTAGGAGGGA 

TGACTTCAAGCGATCCAA^GGCAAGAACTTGATGTATCCATCAGAAGAAGAGATGAGCA 

AACCAGATTACTTGATGTCCTTCCCATCTCCTGAAGGACTTATTAAACAAGAAGAGACGA 

CATGGAGCATGGGTTTCGATACTACAATACCGAGT^^ 

ACACAATGGACTATAATGATGTCTCAGAGTTTGGTTTTGCTGCAGAAACAACAAGTGATG 

GTTTGCCTGATGTCTGCTGGGAACAATTTGCTGCAGGAATCACAGAGACTGGATTCAACT 

GGCCAACTGGTGATGATGATGATAATACGCCAATGAATGATCCTTAGGATCTTTTCATAT 

ATAGTTTAGACCAAAAACCCGTTTCTTATCGGGTGAACTATTAATTCATTATTCATTTTG 

AATGCACTCTTTATAGATATATATAATATTGATGAGTTTGATTGTTCCAAAAAAAAA 

>G2011 Amino Acid Sequence (domain in AA coordinates: 56-147) 

MSPKKDAVSKPTPISVPVSRRSDIPGSIiTTOTDMGFSGSPLPMPLDILQGITPIPPFIiSICr 

FDLVDDPTLDPVI S WGLTGAS FWWDPLEFARI ILPRNFKHNNFS SFVRQLNTYGFRKID 

TDKWEFANEAFLRGKKHLLKNIHRRRSPQSNQTCCSSTSQSQGSPTEVGGEIEKLRKERR 

ALMEEMVEIiQQQS RGTARHVDTVNQRLKAAEQRQKQLLS FLAKLFQNRGFLERIiKNFKGK 

EKGGALGLEKARKKF I KHHQQPQDS PTGGEVVKYE ADDWERLLMYDEETENTKGIjGGMTS 

SDPKGKl^^TyPSEEEMSKPDYLMSFPSPEGLIKQEETTWSMGFDTTIPSFSNTDAWGNTM 

DYM^VSEFGFAAEl^SDGLPDVCWEQFAAGITETGFNWPTGDDDDNTPMNDP* 

>G2094 (1..450) 

ATGCTAGATCCCACCGAGAAAGTAATCGATTCAGAATCAATGGAAAGCAAACTCACATCA 
GTAGATGCGATCGAAGAA<^CAGCAGCAGTAGCAGTAATGAAGCTATCAGCAACGAGAAG 
AAGAGTTGTGCCATTTGTGGTACCAGCAAAACCCCTCTTTGGCGAGGCGGTCCTGCCGGT 
CCCAAGTCGCTTTGTAACGCATGCGGGATCAGAAACAGAAAGAAAAGAAGAACACTGATC 
TCAAATAGATCAGAAGATAAGAAGAAGAAGAGTCATAACAGAAACCCGAAGTTTGGTGAC 
TCGTTGAAGCAGCGATTAATGGAATTGGGGAGAGAAGTGATGATGCAGCGATCAACGGCT 
GAGAATCAACGGCGGAATAAGCTTGGCGAAGAAGAGCAAGCCGCCGTGTTACTCATGGCT 
CTCTCTTATGCTTCTTCCGTTTATGCTTAA 
>G2094 Amino Acid Sequence (domain in AA coordinates: 43-68)
MIiDPTBKVIDSESMESKLTSVDAIEEHSSSSSNEAI SNEKKSCAICGTSKTPLWRGGPAG 
PKSLCNACGIRITOKKRRTLISNRSEDKiaaCSHNRNPKFGDSL 
ENQRRNKLGEEEQAAVXjLMALS YAS S VYA* 
>G2113 (90..590)

ATAACAAACTCATCAAACTTCCTCAGCGTTT 

ATAAAGAAATCTTGTTGTTTGTTGTTGTC^TGGCACCGACAGTTAAAACGGCGGCCGTCA 
AAACCAACGAAGGTAACGGAGTCCGTTACAGAGGAGTGAGGAAGAGACCATGGGGACGTT 
ACGCAGCCGAGATCAGAGATCCTTTCAAGAAGTCACGTGTCTGGCTCGGTACTTTCGACA 
CTCCTGAAGAAGCCGCTCGTGCCTACGACAAACGTGCTATTGAGTTTCGTGGAGCTAAAG 
CCAAAACCAACTTCCCTTGTTACAAGATCAACGCCCACT^ 

TGAGCCAGAGCAGCACCGTGGAATCATCGTTTCCTAATCTCAACCTCGGATCTGACTCTG 
TTAGTTCGAGATTCCCTTTTCCTAAGATTCAGGTTAAGGCTGGGATGATGGTGTTCGATG 
AAAGGAGTGAATCGGATTCTTCGTCGGTGGTGATGGATGTCGTTAGATATGAAGGACGAC 
GTGTGGTTTTGGACTTGGATCTTAATTTCCCTCCTCCACCTGAGAACTGATTAAGATTTA 
ATTATGATTATTAGATATAATTAAATGTTTCTGAATTGAG 

>G2113 Amino Acid Sequence (domain in AA coordinates: TBD) 
MAPTVKTAAVKTNEGNGVRYRGVRKRPWGRYAAEIRDPFKKSRVWLGTFDTPEEAARAYD

KRAIEFRGAKAKTNFPCYNINAHCLSLTQSLSQSSTVESSFPNLNLGSDSVSSRFPFPKI

QVKAGMMVFDERSESDSSSVVMDVVRYEGRRVVLDLDLNFPPPPEN*

>G2115 (41..733)

AATCACTCTACAAAGCCTGTACGTAC^C^^ 

GATCCAAACCAGCAGCACAAAAAAGGAAATGCCTTTGT 

TTCTTCATCTTCTTCCTCGTCTTCGTCra 

TAAGAAGTACAAAGGAGTGAGGATGAGAAGTTGGGGATCATGGGTCTCTGAGATTAGGGC 
ACCAAATGAAAAGACAAGGATTTGGTTAGGTTCTTACTC 

AGCTTACGATGTTGCACTCTTATGTCTCAAAGGCCCTCAAGCCAATCTCAACTTCCCTAC 
TTCTTCTTCTTCTCATCATCTTCTTGATAATCTCTTAGATGAAAATACCCTTTTGTCCCC 
CAAATCCATCCAAAGAGTAGCTGCTCAAGCTGCC^ 
TTCATCAGCCGTCTCGTCACCGTCCGATCATGAT 

TTTGATGGGATCTTTTGTGGACAATCATGTGTCTTTGATGGATTCAACATCTTCATGGTA 

ACTAAACTCGACGACGATGCTCGATGAATACTTCTACGAAGATGCTGACATTCCGCTTTG 
GAGTTTCAATTAATCCGACGGTCCATAATACATACTTTAATTAGT 

>G2115 Amino Acid Sequence (conserved domain in AA coordinates: 46-115)
MVKQERKIQTSSTKKEMPLSSSPSSSS

WVSEIRAPNQKTRIWLGSYSTAEAAARAYDVALLCLKGPQANLNFPTSSSSHHLLDNLLD
ENTLLSPKSIQRVAAQAANSFNHFAPTSSAVSSPSDHDHHHDDGMQSLMGSFVDNHVSLM
DSTSSWYDDHNGMFLFDNGAPFNYSPQLNSTTMLDEYFYEDADIPLWSFN*
>G2130 (41..988)

CCTCTCTTCATTTTTTAACTCCCTCTCTOTCTCTCTCTCTATGGAGAGACGAACGAGACG 

AGTGAAGTTCACAGAGAATCGTACGGTCACAAACGTAGCAGCTACACCATCTAACGGGTC 

TCCGAGACTGGTCCGTATCACTGTTACTGATCCTTTCGCTACTGACTCGTCTAGCGACGA 

CGACGACAACAACAACGTCACGGTGGTTCCAAGAGTGAAACGATACGTGAAGGAGATTAG 

ATTCTGCCAAGGTGAATCTTCTTCCTCCACCGCGGCGAGGAAAGGTAAGCACAAGGAGGA 

GGAAAGCGTAGTGGTTGAAGATGACGTGTCGACGTCGGTGAAGCCTAAAAAGTACAGAGG 

CGTGAGACAGAGACCTTGGGGAAAATTCGCGGCGGAGATTAGAGATCCGTCGAGCCGTAC 

TCGGATTTGGCTTGGGACTTTTGTCACGGCGGAGGAAGCTGCTATAGCGTACGATAGAGC 

CGCGATTCATCTCAAAGGACCTAAAGCGCTCACGAATTTCCTAACTCCGCCGACGCCAAC 

GCCGGTTATCGATCTCCAAACGGTTTCCGCCTGCGATTACGGTAGAGATTCTCGGCAGAG 

CCTTCATTCACCGACCTCTGTTCTAAGATTCAACGTCAACGAGGAAACAGAGCATGAGAT 

TGAAGCGATCGAGCTATCTCCGGAGAGAAAGTCGACGGTTATAAAAGAAGAAGAAGAATC 

GTCGGCGGGTTTGGTGTTCCCGGATCCGTATCTGTTACCGGATTTATCTCTCGCCGGCGA 

ATGTTTTTGGGATACCGAAATTGCCCCTGACCTTTTGTTTCTCGATGAAGAAACCAAAA^ 

CGAATCAACGTTGTTACCAAACAC^GAGGTTTCGAAACAAGGAGAAAACGAAACTO 

TTTCGAGTTTGGTTTGATTGATGATTTCGAGTCTTCTCC^TGGGATGTGGATCATTTCTT 

CGACCATCATCATCACTCTTTCGATTAAAAATCTCTT^ 
>G2130 Amino Acid Sequence (domain in AA coordinates: 93-160)
MERRTRRVKFTENRTVTNVAATPSNGSPRLVRITVTDPFATDSSSDDDDNNNVTVVPRVK
RYVKEIRFCQGESSSSTAARKGKHKEEESVVVEDDVSTSVKPKKYRGVRQRPWGKFAAEI
RDPSSRTRIWLGTFVTAEEAAIAYDRAAIHLKGPKALTNFLTPPTPTPVIDLQTVSACDY
GRDSRQSLHSPTSVLRFNVNEETEHEIEAIELSPERKSTVIKEEEESSAGLVFPDPYLLP
DLSLAGECFWDTEIAPDLLFLDEETKIQSTLLPNTEVSKQGENETEDFEFGLIDDFESSP
WDVDHFFDHHHHSFD*
>G2147 (162..1262)

CTGTGATTGTCAAGAGTTTGAACACACAAAGAAGAAAGAAGAACTCAACATTTCAA 

GAAGAAAGAGAGAAGAGAGAAGGTCCAATAATAGAGAGAACAAAAAAAAAGAGAGCTTAA 

TTGTCAGTTTATTCTCTGCAAACGTGCGGCCTAAGTAACACATGTCGAATTATGGAGTTA 

AAGAGCTCACATGGGAAAATGGGCAACTAACCGTTCATGGTCTAGGCGACGAAGTAGAAC 

CAACCACCTCGAATAACCCTATTTGGACTCAAAGTCTCAACGGTTGTGAGACTTTGGAGT 

CTGTGGTTCATCAAGCGGCTCTACAGCAGCCAAGCAAGTTTCAGCTGCAGAGT 

GTCCAAACCACAATTATGAGAGCAAGGATGGATCTTGTTCAAGAAT^CGCGGTTATCCTC 

AAGAAATGGACCGATGGTTCGCTGTTCAAGAGGAGAGCCATAGAGTTGGCCACAGCGTCA 

CTGCAAGTGCGAGTGGTACCAATATGTCTTGGGCGTCTTTTGAATCCGGTCGGAGCTTGA 

AGACAGCTAGAACCGGAGACAGAGACTATTTCCGCTCTGGATCGGAAACTCAAGATACTG 

AAGGAGATGAACAAGAGACAAGAGGAGAAGCAGGTAGATCTAATGGACGACGGGGACGAG 

CAGCAGCGATTCACAACGAGTCCGAAAGGAGACGGCGTGATAGGATAAACCAGAGGATGA 

GAACACTTCAGAAGCTGCTTCCTACTGCAAGTAAGGCGGATAAAGTCTCAATCTTGGATG

ATGTTATCGAACACTTGAAACAGCTACAAGCACAAGTACAGTTCATGAGCCTAAGAGCCA 

ACTTGCCACAACAAATGATGATTCCGCAACTACCTCCACCACAGTCAGTTCTCAG 

AAGACCAACAACAACAACAACAACAGCAGCAGCAGCAGCAACAAGAGCAGCAACAGTTTC

AGATGTCGTTGCTTGCAACAATGGCAAGAATGGGAATGGGAGGTGGTGGAAATGGTTATG 

GAGGTTTAGTTCCTCCTCCTCCTCCTCCACCAATGATGGTCCCTCCTATGGGTAACAGAG 

ACTGCACCAACGGTTCTTCAGCCACATTATCTGATCCATACAGCGC 

CAATGAATATGGATCTCTACAATAAAATGGCAGCAGCTATCTATAGACAACAGTCTGATC 

AAACAACAAAGGTAAATATCGGCATGCCTTCAAGTTCTTCGAATCATGAG 

AGTCTAGCGACCTAGTATTATTGATCCATATATATAGTTCTTGAAAGATTGTTGTATCAT 

GATTGTAAAAACTGTTTTGAGTATGGAAAAAGACTTGCAGATAAAA 

>G2147 Amino Acid Sequence (domain in AA coordinates: 160-234)

MSNYGVKELTWENGQLTVHGLGDEVEPTTSNNPIWTQSLNGCETLESVVHQAALQQPSKF

QLQSPNGPNHNYESKDGSCSRKRGYPQEMDRWFAVQEESHRVGHSVTASASGTNMSWASF
ESGRSLKTARTGDRDYFRSGSETQDTEGDEQETRGEAGRSNGRRGRAAAIHNESERRRRD

RINQRMRTLQKLLPTASKADKVSILDDVIEHLKQLQAQVQFMSLRANLPQQMMIPQLPPP

QSVLSIQHQQQQQQQQQQQQQQQQQFQMSLLATMARMGMGGGGNGYGGLVPPPPPPPMMV
PPMGNRDCTNGSSATLSDPYSAFFAQTMNMDLYNKMAAAIYRQQSDQTTKVNIGMPSSSS

NHEKRD*

>G2156 (384..1292)

TTTTTTTTCCCTTTCCTCGTTGAAAAAAAGTACOT 

GCACATGAATTAATTTGAAGCTTCCCTAGAATTCTTTCACATCAATTAATACGACACCGT 

CTCGGGTGAAGAATCTCTCCTCTCTTGCCCTAAAGCGAGTTAGGGTTTAACACACAAAGC 

ATACCCTTTAGATTTGTGTCTCTTAGCTCTGTTTTTGTCGGCTTGTGTAACCGATCAACT 

CAAGCTATTGGCTCCTCACCTCCTGAAATTTGACTTCTCCAATGGATCTCAAAGTTTCTC 

TTATATGAATTCTATCTTCACCCTCACAATATCTTTATATATATGAGCCACAAGAACAAG 

AAGAGTGAGTAGATGCGGCTGCCATGGACGGTGGTTACGATCAATCCGGAGGAGCTTCTA 

GATACTTTCACAAC€TCTT(^GGCCTGAGCT^ 

TTCACCCTTTGCCTGAGCCTCAGCCTCAACC^^ 

AATCTGACTCCAACAAGGATCCGGGTTCCGACCCAGTTACCTCTGGTTCAACCGGGAAAC 
GTCCACGTGGACGTCCTCCGGGATCCAAGAACAAGCCGAAGCCACCGGTGATAGTGACTA 
GAGATAGCCCCAACGTGCTTAGATCTCATGTTCTTGAAGTCTCATCTGGAGCCGACATAG
TCGAGAGCGTTACCACTTACGCTCGCAGGAGAGGAAGAGGAGTCTCCATTCTCAGTGGTA 
ACGGCACGGTGGCTAACGTCAGTCTCCGGCAGCCGGCAACGACAGCGGCTCATGGGGCAA 
ATGGTGGAACCGGAGGTGTTGTGGCTCTACATGGAAGGTTTGAGATACTTTCCCTCACAG 
GTACGGTGTTGCCGCCCCCTGCGCCGCCAGGATCCGGTGGTCTTTCTATCTTTCTTTCCG 
GCGTTCAAGGTGAGGTGATTGGAGGAAACGTGGTGGCTCCGCTTGTGGCTTCGGGTCCAG 
TGATACTAATGGCTGCATCGTTCTCTAATGCAACTTTCGAAAGGCTTCCCCTTGAAGATG
AAGGAGGAGAAGGTGGAGAGGGAGGAGAAGTTGGAGAGGGAGGAGGAGGAGAAGGTGGTC 
CACCGCCGGCCACGTCATCATCACCACCATCTGGAGCCGGTCAAGGACAGTTAAGAGGTA 
ACATGAGTGGTTATGATCAGTTTGCCGGTGATCCTCATTTGCTTGGTTGGGGAGCCGCAG 
CCGCAGCCGCACC^CCAAGACCAGCCTTTTAGAATTGAAAATTATGTCCGTAACATAGCT 
GTAACCAAATTTCATTTCTCAAAATTAAAAGAAAAAAAAAA 

>G2156 Amino Acid Sequence (domain in AA coordinates: 66-86)

MDGGYDQSGGASRYFHNLFRPELHHQLQPQPQLHPLPQPQPQPQPQQQNSDDESDSNKDP

GSDPVTSGSTGKRPRGRPPGSKNKPKPPVIVTRDSPNVLRSHVLEVSSGADIVESVTTYA

RRRGRGVSILSGNGTVANVSLRQPATTAAHGANGGTGGVVALHGRFEILSLTGTVLPPPA

PPGSGGLSIFLSGVQGQVIGGNVVAPLVASGPVILMAASFSNATFERLPLEDEGGEGGEG

GEVGEGGGGEGGPPPATSSSPPSGAGQGQLRGNMSGYDQFAGDPHLLGWGAAAAAAPPRP

AF*

>G2294 (24..659)

TCCTCCCTTAATTAGTATCZAAA7\ATGGTGAAAACACTTCAAAAGACACCAAAGAGAATGT 
CATCTCCATCATCATCATCTTCATC^TCC^ 

AGAAGTACAAGGGAGTGAGAATGAGAAGTTGGGGTTCATGGGTTTCAGAGATCAGAGCTC 
CTAATCAAAAGACAAGGATCTGGCTTGGTTCTTACTCAACTGCTGAAGCCGCGGCTAGAG 
COTACGACGCAGCACTCCTATGTCTTAAAGGATCCTCAGCTAATAATCTC^CTTCCC^^ 
AGATCTCAACTTCTCTCTACCATATTATCAACAATGGTGATAACAACAATGACATGTCCC 
CTAAGTCTATACAAAGAGTAGCAGCTGCAGCTGCTGCTGCCAACACAGATCCTTCCTCAT 
CATCAGTCTCTACTTCATCTCCATTGCTTTCCTCTCCATCTGAAGATCTCTATGATGTTG
TCTCCATGTCACAGTATGACCAACAAGTCTCCTTGTCT 

GCTTTGATGGTGATGATCAGTTCATGTTCATTAATGGAGTCTCCGCGCCGTATTTGACAA 
CATCACTTTCTGATGATTTCTTTGAGGAAGGAGATATCAGATTATGGAACTTCTGCTGAT 
TCTACTTTCATTATACCTTATTCTTTG 

>G2294 Amino Acid Sequence (conserved domain in AA coordinates: 32-102)
MVKTLQKTPKRMSSPSSSSSSSSSTSSSSIRMKKYKGVRMRSWGSWVSEIRAPNQKTRIW

LGSYSTAEAAARAYDAALLCLKGSSANNLNFPEISTSLYHIINNGDNNNDMSPKSIQRVA
AAAAAANTDPSSSSVSTSSPLLSSPSEDLYDVVSMSQYDQQVSLSESSSWYNCFDGDDQF
MFINGVSAPYLTTSLSDDFFEEGDIRLWNFC*
>G2510 (16..594)

ATAACAAACTCTTTAATGTCACCACAGAGAATGAAGCTATCATCACCACCAGTTACCAAC 
AACGAACCAACCGCCACCGCTTCTGCCGTTAAATCTTGCGGCGGAGGAGGTAAAGAAACC 
AGCTCATCGACCACGAGGC^TCCAGTGTACCA^ 

TGGGTTTCTGAGATCAGAGAGCCCCGGAAAAAGTCTCGGATTTGGCTCGGATCTTTTCCG 
GTGCCGGAGATGGCTGCTAAGGCCTACGACGTGGCAGCGTTTTGTCTAAAAGGTAGAAAA 
GCTCAGCTGAATTTCCCTGAAGAAATCGAGGATCTACCTCGACCGTCCACGTGTACTCCC 
AGAGATATCCAAGTCGCAGCGGCCAAAGCAGCGA 

GATGATGACGTGGCAGGAATAGACGACGGAGATGATTTCTGGGAAGGCATTGAGCTGCCT 
GAGCTTATGATGAGTGGAGGTGGGTGGTCGCCGGAGCCTTTTGTTGCCGGAGATGATGCC 
ACGTGGCTTGTCGACGGAGACTTGTATCAGTATCAGTTCATGGCGTGTCTGTGAGTGTTG
CTGTCGATTGTGTCGTATTCGTTATACGTGTACGTTGTATCGTTATTGTGTTGGCTCACT 
TAATTTAATGCATATGCATGTATATTTTC^TTTATTTGTTTCTAGTTTATTGTTTTACGC 
GATTAATAATTAGATACCTGTTTCTCAAGTTAGTTATCAGGTTTGTACGCATCTACAAAA 
ATACGTATAAGTGTATGTTCTTATATACAGTTTTTGTTTGCATAAGTATTGCTACTTATT 
CTAAAAAAAAAAAAAAAAAAAAAAAAA 

>G2510 Amino Acid Sequence (conserved domain in AA coordinates: 41-108)
MSPQRMKLSSPPVTNNEPTATASAVKSCGGGGKETSSSTTRHPVYHGVRKRRWGKWVSEI
REPRKKSRIWLGSFPVPEMAAKAYDVAAFCLKGRKAQLNFPEEIEDLPRPSTCTPRDIQV
AAAKAANAVKIIKMGDDDVAGIDDGDDFWEGIELPELMMSGGGWSPEPFVAGDDATWLVD
GDLYQYQFMACL*
>G2893 (130..981)

AAATCATAAAAGCCTCTCTCTTAGTCTATTTTTATCTCACGGCTCTCTCTCCCCTCTCTA 
CACACACAAACACAAATAAAGCGTAAAACTGAA 

CATATTAATATGTCAAATATAACAAAGAAGAAGTGTAATGGAAATGAAGAGGGTGCAGAG 
CAGAGGAAAGGGCCTTGGACACTCGAGGAAGACACTCTTCTCACCAATTACATTTCCCAT 
AACGGTGAAGGCCGATGGAATCTGCTCGCTAAATCTTCTGGGCTAAAGAGAGCAGGAAAA

AGTTGTAGATTGAGATGGTTGAATTACCTTAAACCCGACATAAAGCGTGGGAATCTCACT 

CCTCAAGAACAACTTTTAATCCTTGAGCTCCATTCTAAATGGGGTAATAGGTGGTCAAAA 

ATTTCGAAGTATTTACCAGGAAGAACAGACAACGATATCAAAAACTACTGGAGAACTAGA 

GTCCAGAAACAAGCACGCCAGCTCAACATAGATTCCAATAGCCACAAGTTC^ 

GTTCGTAGCTTTTGGTTTCCAAGACTGATCAACGAGATTAAAGACAACTCATACACCAAC 

AATATTAAAGCTAATGCTCCTGATTTACTTGGACCAATTTTACGAGACAGCAAAGATTTG 

GGTTTCAACAACATGGATTGTTCCACTTC 

TTCATGGATTTTTCTGATCTTGAAACCACAATGTCCTTGGAAGGATCACGAGGGGGTAGT 
AGTCAATGTGTGAGTGAGGTTTATAGCTCCTTCCCTTGCCTAGAGGAGGAGTACATGGTG 
GCCGTTATGGGCAGTTC^GAGATTTCAGCAT^ 

TACGAGGATGATGTGACACAAGATCTAATGTGGAACATGGATGACATTTGGCAGTTTAAC 

GAGTATGCACACTTTAATTAGGTTATATTATATTTATGT^ 

TTTATCGGTCTTTTATTAAATTTTGATTG 

TAGTTTTTAATGAAAAAAATGTTTAAGCGCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 

>G2893 Amino Acid Sequence (conserved domain in AA coordinates: 19-120)

MSNITKKKCNGNEEGAEQRKGPWTLEEDTLLTNYISHNGEGRWNLLAKSSGLKRAGKSCR

LRWLNYLKPDIKRGNLTPQEQLLILELHSKWGNRWSKISKYLPGRTDNDIKNYWRTRVQK

QARQLNIDSNSHKFIEVVRSFWFPRLINEIKDNSYTNNIKANAPDLLGPILRDSKDLGFN

NMDCSTSMSEDLKKTSQFMDFSDLETTMSLEGSRGGSSQCVSEVYSSFPCLEEEYMVAVM

GSSDISALHDCHVADSKYEDDVTQDLMWNMDDIWQFNEYAHFN*

>G340 (97..834)

ATGAAATCTCTGTAGTTTTTTTTTGTTCCTTTCTTAAATTTCGAAAGAAAGACATTTATT 

AAACG&AAATAACTCTTTAGATCATTGCAAGGAAAAATGT 

GCATTCTACGATATCGGAGAGCAGCAATACTCTACTTTCGGGTACATTTTA 

GGGAACGCAGGAGCTTACGAGATTGACCCTTCGATCCCAAACATCGACGATGCGATCTAC 

GGCTCAGATGAGTTCCGTATGTACGCTTACAAAATCAAACGGTGTCCTCGTACTCGTAGC

CACGACTGGACGGAGTGTCCCTACGCTCACCGTGGCGAGAAAGCCACACGCCGTGATCCT 

CGCCGTTACACTTACTGTGCAGTCGCATGCCCGGCTTTCCGAAATGGCGCATGCCACCGT 

GGCGACTCATGCGAATTCGCACATGGCGTATTCGAGTACTGGCTCCACCCGGCGCGTTAC

CGAACACGCGCATGTAACGCCGGGAACTTGTGTCAGAGGAAAGTGTGTTTCTTTGCCCAC 

GCGCCGGAGCAGCTAAGGCAGTCTGAAGGAAAGCACAGGTGCAGGTACGCATATAGGCCG 

GTGAGGGCTAGAGGTGGTGGAAACGGCGATGGAGTGACGATGAGAATGGACGACGAGGGT 

TACGACACGTCACGGTCTCCGGTGAGAAGCGGGAAAGATGATTTAGATAGTAACGAGGAG

AAGGTGTTGTTGAAGTGTTGGAGTCGGATGAGCATTGTGGATGATCATTATGAGCCGTCC 

GATTTGGATTTGGATTTGTCACACTTTGATTGGATCTCAGAGTTGGTCGATTAAATTTGG 

GAAATCAAAGCAGAGAACAAAAGAAACCCGATAAATAAAGTGGATTTTGTTAAAATCCAC 

AAGATCAAGATTCAAGATGAGAGATCTTGTCATGTATATGGTAAATTTAATTGTAATGAT 

TTATTGCAATGTCGCAAAAGAAGTTACTTCTCTTTGCATGTAAACAGATTCTTGATCTTC 

TATAAGTCTTTGTATTAA 

>G340 Amino Acid Sequence (domain in AA coordinates: 37-154) 
MLKSASPMAFYDIGEQQYSTFGYILSKPGNAGAYEIDPSIPNIDDAIYGSDEFRMYAYKI 
KRCPRTRSHDWTECPYAHRGEKATRRDPRRYTYCAVACPAFRNGACHRGDSCEFAHGVFE

YWLHPARYRTRACNAGNLCQRKVCFFAHAPEQLRQSEGKHRCRYAYRPVRARGGGNGDGV
TMRMDDEGYDTSRSPVRSGKDDLDSNEEKVLLKCWSRMSIVDDHYEPSDLDLDLSHFDWI
SELVD* 

>G39 (75..638)

GTTTCC^CAGTCCCTCTACTTGTGC^TAAAACTGTAAAAC^CTACTCTGAAAATTTTGCT 
TCTGTTAGGATATAATGCCACCCTCTCCTCCTAAATCTCCTTTTATTAGCTCTTCACTCA 
AAGGAGCTCATGAAGATCGCAAATTTAAATGCTATAGGGGTGTCCGAAAGAGGTCTTGGG 
GCAAATGGGTGTCTGAAATCAGAGTTCCAAAGACTGGACGACGAATATGGCTAGGTTCAT 
ACGATGCTCC^GAGAAGGCAGCTAGAGCCTATGATGCTGCTTTGTTCTGTATTAGGGGTG 
AGAAGGGAGTTTACAATTTTCCCACTGATAAAAAGCCGCAGCTTCCAGAAGGTTCTGTCC 
GGCCTCTGTCCAAGCTCGACATACAGACAATAGCAACAAACTATGCTTCATCAGTTGTGC 
ATGTACCTTCCCATGCCACCACACTCCCGGCAACAACCCAGGTTCCCTCTGAAGTTCCTG 
CTTCCTCTGATGTTTCTGCTTCTACTGAGATTACAGAGATGGTCGATGAATATTATCTCC 
CAACCGATGCAACTGCAGAATCAATATTCTCAGTTGAAGACTTACAACTGGACAGTTTCC 






TCATGATGGACATTGATTGGATAAACAATCTAATCTGATGTGTAACGTCACTTGCAGTGA 
CATTTAATATGGTTTANCTATCAGTTACCTGTCTGCTTCTTGTAAGGGTATACTTGGATC 
CTTGTCTTTGAACTTGTTTTATTTAGCATGCAAA 

>G39 Amino Acid Sequence (domain in AA coordinates: 24-90) 
MPPSPPKSPFISSSLKGAHEDRKFKCYRGVRKRSWGKWVSEIRVPKTGRRIWLGSYDAPE
KAARAYDAALFCIRGEKGVYNFPTDKKPQLPEGSVRPLSKLDIQTIATNYASSVVHVPSH
ATTLPATTQVPSEVPASSDVSASTEITEMVDEYYLPTDATAESIFSVEDLQLDSFLMMDI

DWINNLI*

>G439 (128..967)

TATAAATCTTCGTTTCTACTTTTTTTTCTTCCATAATATAGTCAATTCGTTTTCTTAATT 
AGGGCTTCTTCTCTTTGTTTCTCCAATC^ 

TATACAAATGGCAATGGCTTTAAACATGAATGCTTACGTAGACGAGTTCATGG 

TGAACCATTCATGAAGGTAACTTCATCTTCTTCTACTTCGAATTCATCAAATCCAAAACC 

ATTAACTCCTAATTTCATCCCTAATAATGACCAAGTCTTACCGGTATCTAACCAAACCGG 

TCCGATTGGGCTAAACCAGCTCACTCCAACACAAATCCTCCAAATTCAGACAGAGTTACA 

TCTCCGGCAAAACCAATCTCGTCGTCGCGCTGGTAGTCATCTTCTCACCGCTAAACCAAC 

OTCAATGAAGAAAATCGACGTAGCAACTAAACCGGTTAAACTATACCGAGGCGTAAGACA 

GAGGCAATGGGGTAAATGGGTAGCTGAGATTCGGCTACCTAAAAACCGAACCCGGTTATG 

GCTCGGTACGTTCGAAACGGCTCAAGAAGCTGCATTAGCTTACGATCAAGCAGCTC^ 

GAT(^GAGGAGAC^CGCTCGTCTCAATTTCCC^GAC^TTGTTTOTCAAGGA<^CTATAA 

ACAGATATTGTCTCCGTCTATCAACGCAAAGATCGAATCCATCTGCAATAGTTCTGATCT

TCCACTGCCTCAGATCGAGAAACZAGAACAAAACAGAGGAGGTGCTCTCTGGTTTTTCCAA 

ACCGGAGAAAGAACCGGAATTTGGGGAGATATACGGATGCGGATACTCGGGCTCATCTCC 

TGAGTCGGATATAACGTTGTTGGATTTCTCAAGCGACTGTGTGAAAGAAGATGAGAGTTT 

CTTGATGGGTTTGCACAAGTATCCTTCTTTGGAGATTGATTGGGACGCTATAGAGAAACT 

CTTCTGAATCCATTTTATCTTTTTGATTCATTTGTCTCTAAATTGTAGAATTTTATTTTC 

AGAGCTTTGTAAGGGAAGTTCTTGAATGAGAGTTGCAGAGGACTAGTGGAACCTAACTCT 

GTTTTCTTTTGTAAGTATTGTTTATAATGGGCCGTTGAATGGGCCTTATTGATTTAAACA 

GCCCAAGTTTTTAAAAAAAAAAAAAAAAAAAAAAAAA 

>G439 Amino Acid Sequence (domain in AA coordinates: 110-177) 
MAMALNMNAYVDEFMEALEPFMKVTSSSSTSNSSNPKPLTPNFIPNNDQVLPVSNQTGPI

GLNQLTPTQILQIQTELHLRQNQSRRRAGSHLLTAKPTSMKKIDVATKPVKLYRGVRQRQ
WGKWVAEIRLPKNRTRLWLGTFETAQEAALAYDQAAHKIRGDNARLNFPDIVRQ

LSPSINAKIESICNSSDLPLPQIEKQNKTEEVLSGFSKPEKEPEFGEIYGCGYSGSSPES

DITLLDFSSDCVKEDESFLMGLHKYPSLEIDWDAIEKLF*
>G470 (1..2580) 

atggcgagttcggaggtttcaatgaaaggtaatcgtggaggagataacttctcctcctct 
ggttttagtgaccctaaggagactagaaatgtctccgtcgccggcgaggggcaaaaaagt 
aattctacccgatccgctgcggctgagcgtgctttggaccctgaggctgctctttacaga 
gagctatggcacgcttgtgctggtccgcttgtgacggttcctagacaagacgaccgagtc 
ttctattttcctcaaggacacatcgagcaggtggaggcttcgacgaaccaggcggcagaa 
caacagatgcctctctatgatcttccgtcaaagcttctctgtcgagttattaatgtagat 

TTAAAGGCAGAGGCAGATACAGATGAAGTTTATGCGCAGATTACTCTTCTTCCTGAGGCT 
AATCAAGACGAGAATGCAATTGAGAAAGAAGCGCCTCTTCCTCCACCTCCGAGGTTCCAG 
GTGCATTCGTTCTGCAAAACCTTGACTGCATCCGACACAAGTACACA 

GTTCTTAGGCGACATGCGGATGAATGTCTCCCACCTCTGGATATGTCTCGACAGCCTCCC 
ACTCAAGAGTTAGTTGCAAAGGATTTGCATGCAAATGAGTGGCGATTCAGACATATATTC 
CGGGGTCAACCACGGAGGCATTTGCTACAGAGTGGGTGGAGTGTGTTTGTTAGCTCCAAA
AGGCTAGTTGCAGGCGATGCGTTTATATTTCTAAGGGGCGAGAATGGAGAATTAAGAGTT 
GGTGTAAGGCGTGCGATGCGACAACAAGGAAACGTGCCGTCTTCTGTTATATCTAGCCAT 
AGCATGCATCTTGGAGTACTGGCCACCGCATGGCATGCCATTTCAACAGGGACTATGTTT 
ACAGTCTACTACAAACCCAGGACGAGCCCATCTGAGTTTATTGTTCCGTTCGATCAGTAT 
ATGGAGTCTGTTAAGAATAACTACTCTATTGGCATGAGATTCAAAATGAGATTTGAAGGC 
GAAGAGGCTCCTGAGCAGAGGTTTACTGGCACAATCGTTGGGATTGAAGAGTCTGATCCT
ACTAGGTGGCCAAAATCAAAGTGGAGATCCCTCAAGGTGAGATGGGATGAGACTTCTAGT 
ATTCCTCGACCTGATAGAGTATCTCCGTGGAAAGTAGAGCCAGCTCTTGCTCCTCCTGCT 
TTGAGTCCTGTTCCAATGCCTAGGCCTAAGAGGCCCAGATCAAATATAGCACCTTCATCT 
CCTGACTCTTCGATGCTTACCAGAGAAGGTACAACTAAGGCAAACATGGACCCTTTACCA
GCAAGCGGACTTTCAAGGGTCTTGCAAGGTCAAG 

ACTGAGAGTGTAGAGTGTGATGCTCCTGAGAATTCTGTTGTCTGGCAATCTTCAGCGGAT 
GATGATAAGGTTGACGTGGTTTCGGGTTCTAGAAGATATGGATCTGAGAACTGGATGTCC 
TCAGCCAGGCATGAACCTACTTACACAGATTTGCTCTCCGGCTTTGGGACTAACATAGAT
CCATCCCATGGTCAGCGGATACCTTTTTATGACCA 

AAGAGAATCTTGAGTGATTCAGAAGGCAAGTTCGATTATCTTGCTAACCAGTGGCAGATG 

ATACACTCTGGTCTCTCCCTGAAGTTACATGAATCTCCTAAGGTACCTGCAGCAACTGAT 

GCGTCTCTCCAAGGGCGATGCAATGTTAAATACAGCGAATATCCTGTTCTTAATGGTCTA 

TCGACTGAGAATGCTGGTGGTAACTGGCCAATACGTCCACGTGCTTTGAATTATTATGAG 

GAAGTGGTCAATGCTCAAGCGCAAGCTCAGGCTAGGGAGCAAGTAACAAAACAACCCTTC 

ACGATACAAGAGGAGACAGCAAAGTCAAGAGAAGGGAACTGC^GGCTC 

CTGACCAACAACATGAATGGGACAGACTCAACCATGTCTCAGAGAAACAACTTGAATGAT 

GCTGCGGGGCTTACAGAGATAGCATCACGAAAGGT^ 

GGGTCAAAATCAACAAACGATCATCGTGAACAGGGAAGACCATTCCAGACTAATAATCCT 
CATCCGAAGGATGCTCAAACGAAAACCAACTCAAGTAG<^ 

CAGGGAATTGC^CTTGGCCGTTCAGTGGATCTTTCAAAGTTCCAAAACTATGAGG 

GTCGCTGAGCTGGACAGGCTGTTTGAGTTCAATGGAGAGTTGATGGCTCCTAAGAAAGAT 

TGGTTGATAGTTTACACAGATGAAGAGAATGATATGATGCTTGTTGGTGACGATCCTTGG 

CAGGAGTTTTGTTGCATGGTTCGCAAAATCTTCATATACACGAAAGAGGAAGTGAGGAAG

ATGAACCCGGGGACTTTAAGCTGTAGGAGCGAGGAAGAAGCAGTTGTTGGGGAAGGATCA 

GATGCAAAGGACGCCAAGTCTGCATC^AATCCTTCATTGTCCAGCGCTGGGAACTCTTAA 

>G470 Amino Acid Sequence (domain in AA coordinates: 61-393) 

MASSEVSMKGNRGGDNFSSSGFSDPKETRNVSVAGEGQKSNSTRSAAAERALDPEAALYR

ELWHACAGPLVTVPRQDDRVFYFPQGHIEQVEASTNQAAEQQMPLYDLPSKLLCRVINVD
LKAEADTDEVYAQITLLPEANQDENAIEKEAPLPPPPRFQVHSFCKTLTASDTSTHGGFS
VLRRHADECLPPLDMSRQPPTQELVAKDLHANEWRFRHIFRGQPRRHLLQSGWSVFVSSK

RLVAGDAFIFLRGENGELRVGVRRAMRQQGNVPSSVISSHSMHLGVLATAWHAISTGTMF

TVYYKPRTSPSEFIVPFDQYMESVKNNYSIGMRFKMRFEGEEAPEQRFTGTIVGIEESDP

TRWPKSKWRSLKVRWDETSSIPRPDRVSPWKVEPALAPPALSPVPMPRPKRPRSNIAPSS

PDSSMLTREGTTKANMDPLPASGLSRVLQGQEYSTLRTKHTESVECDAPENSVVWQSSAD

DDKVDVVSGSRRYGSENWMSSARHEPTYTDLLSGFGTNIDPSHGQRIPFYDHSSSPSMPA

KRILSDSEGKFDYLANQWQMIHSGLSLKLHESPKVPAATDASLQGRCNVKYSEYPVLNGL

STENAGGNWPIRPRALNYYEEVVNAQAQAQAREQVTKQPFTIQEETAKSREGNCRLFGIP

LTNNMNGTDSTMSQRNNLNDAAGLTQIASPKVQDLSDQSKGSKSTNDHREQGRPFQTNNP

HPKDAQTKTNSSRSCTKVHKQGIALGRSVDLSKFQNYEELVAELDRLFEFNGELMAPKKD

WLIVYTDEENDMMLVGDDPWQEFCCMVRKIFIYTKEEVRKMNPGTLSCRSEEEAVVGEGS

DAKDAKSASNPSLSSAGNS*

>G652 (1..606) 

atgagcggaggaggagacgtgaacatgagtggtggagacagacgcaagggaacggtgaag 
tggtttgatacacagaaggggtttggtttcatcacacctagcgacggtggtgacgatctc 
ttcgttcaccagtcttccatcagatctgaaggatttcgtagcctcgcagctgaggaatct 
gttgagttcgacgttgaggttgacaactccggccgtcccaaggctattgaagtgtctgga 
cccgacggtgctcccgttcagggtaacagcggtggtggtggttcatctggtggacgcggt 
ggttttggcggcggtggtggaagaggagggggacgtggtggaggaagctacggaggaggt 
tatggtggaagaggaagcggtggccgtggaggaggtggtggtgataattcttgctttaag 
tgcggtgaaccaggtcacatggcgagagaatgctctcaaggtggtggaggatacagcgga 
ggcgggggtggtggaaggtacgggtctggcggcggcggaggaggaggtggtggtggctta 
agctgctacagctgtggagagtctgggcactttgcaagggattgcactagcggtggtgct 
cgttga 

>G652 Amino Acid Sequence (domain in AA coordinates: 28-49, 137-151, 182-196)

MSGGGDVNMSGGDRRKGTVKWFDTQKGFGFITPSDGGDDLFVHQSSIRSEGFRSLAAEES

VEFDVEVDNSGRPKAIEVSGPDGAPVQGNSGGGGSSGGRGGFGGGGGRGGGRGGGSYGGG

YGGRGSGGRGGGGGDNSCFKCGEPGHMARECSQGGGGYSGGGGGGRYGSGGGGGGGGGGL 

SCYSCGESGHFARDCTSGGAR* 

>G671 (61..1119)

TTCACTTGAGAACAACCCCCTTTGAACTCGATCAAGAAAGCTAAGTTTGAAGAATCAAGA 
ATGGTGCGGACACCGTGTTGCAAAGCCGAACTAGGGTTAAAGAAAGGAGCTTGGACTCCC

GAGGAAGATCAGAAGCTTCTCTCTTACCTTAACCGCCACGGTGAAGGTGGATGGCGAACT 

CTCCCCGAAAAAGCTGGACTCAAGAGATGCGGCAAAAGCTGCAGACTGAGATGGGCCAAT 

TATCTTAGACCTGACATCAAAAGAGGAGAGTTCACTGAAGACGAAGAACGTTCAATCATC 

TCTCTTCACGCCCTTCACGGCAACAAATGGTCTGCTATAGCTCGTGGACTACCAGGAAGA 

ACCGATAACGAGATCAAGAACTACTGGAACACTCATATCAAAAAACGTTTGATCAAGAAA 

GGTATTGATCCAGTTACACACAAGGGCATAACCTCCGGTACCGACAAATCAGAAAACCTC 

CCGGAGAAACAAAATGTTAATCTGACAACTAGTGACCATGATCTTGATAATGACAAGGCG

AAGAAGAACAACAAGAATTTTGGATTATCATCGGCTAGTTTCTTGAACAAAGTAGCTAAT 

AGGTTCGGAAAGAGAATCAATCAGAGTGTTCTGTCTGAGATTATCGGAAGTGGAGGCCCA 

CTTGCTTCTACTAGTCACACTACTAATACrACAACTACAAGTGTTTCCGTTGACT 

TCAGTTAAGTCAACGAGTTCTTCCTTCGCACCAACCTCGAATCTTCTCTGCCATGGGACC 

GTTGCAACAACACCAGTTTCATCGAACTTTGACGTTGATGGTAACGTTAATCTGACGTGT 

TCTTCGTCCACGTTCTCTGATTCCTCCGTTAACAATCCTCTAATGTACTGCGATAATTTC 

GTTGGTAATAACAACGTTGATGATGAGGATACTATCGGGTTCTCCACATTTCTGAATGAT 

GAAGATTTCATGATGTTGGAGGAGTCTTGTC^ 

ACGAGGTTTCTTCACGAGGATGAAAACGACGTCGTTGATGTGACGCCGGTCTATGAACGT 
CAAGACTTGTTTGACGAAATTGATAACTATTTTGGATGAGTGAAACTCATAATCGATGAA 
TCCCACGTGACCATGTCAATATGATGTCTATGGATATGTXACCTTGATGATGTTGATGGT 
AATAATAATAAATAATAGATGGTGATGATGACCATGCATGAATCATGAATGTAGTTCGTG 
TTGTCACATATGCTTGTGTTTTTGTGT^ 

TGTAAATGGATTATAAATGGTGATGTAATAATTATAATGTTAAAAAAAAAAAAAAAAAAA 
AAAA 

>G671 Amino Acid Sequence (domain in AA coordinates: 15-115) 

MVRTPCCKAELGLKKGAWTPEEDQKLLSYLNRHGEGGWRTLPEKAGLKRCGKSCRLRWAN

YLRPDIKRGEFTEDEERSIISLHALHGNKWSAIARGLPGRTDNEIKNYWNTHIKKRLIKK

GIDPVTHKGITSGTDKSENLPEKQNVNLTTSDHDLDNDKAKKNNKNFGLSSASFLNKVAN

RFGKRINQSVLSEIIGSGGPLASTSHTTNTTTTSVSVDSESVKSTSSSFAPTSNLLCHGT
VATTPVSSNFDVDGNVNLTCSSSTFSDSSVNNPLMYCDNFVGNNNVDDEDTIGFSTFLND
EDFMMLEESCVENTAFMKELTRFLHEDENDVVDVTPVYERQDLFDEIDNYFG*
>G779 (110..712)

GACATGCATGTAAGCATTCGGTTAATTAATCGAGTCAAAGATATATATCAGTAAATACAT 

ATGTGTATATTTCTGGAAAAAGAATATATATATTGAGAAATAAGAAAAGATGAAAATGGA 

AAATGGTATGTATAAAAAGAAAGGAGTGTGCGACTCTTGTGTCTCGTCCAAAAGCAGATC 

CAACCACAGCCCCAAAAGAAGCATGATGGAGCCTCAGCCTCACCATCTCCTC^ 

GAACZy^AGCTAATGATCTTCTCACACAAGAACACGCAGCTTTTCTCAATGATCCTCACCA 

TCTCATGTTAGATCCACCTCCCGAAACCCTAATTCACTTGGACGAAGACGAAGAGTACGA 

TGAAGACATGGATGCGATGAAGGAGATGCAGTACATGATCGCCGTCATGCAGCCCGTAGA 

CATCGACCCTGCCACGGTCCCTAAGCCGAACCGCCGTAACGTAAGGATAAGCGACGATCC 

TCAGACGGTGGTTGCTCGTCGGCGTCGGGAAAGGATCAGCGAGAAGATCCGAATTCTCAA

GAGGATCGTGCCTGGTGGTGCGAAGATGGACACAGCTTCCATGCTCGACGAAGCCATACG 

TTACACCAAGTTCTTGAAACGGCAGGTGAGGAT^ 

TCCTATGGCTAACCCCTCTTACCTTTGTTATTACCACAACTCCCAACCCTGATGAACTAC 

ACAGAAGCTCGCTAGCTAGACATTTGGTGTCATCCTCTCAACCTTT 

>G779 Amino Acid Sequence (domain in AA coordinates: 126-182) 

MKMENGMYKKKGVCDS C VS SKSRS 

DPHHLMLDPPPETLIHLDEDEEYDEDMDAMKEMQYMIAVMQPVDIDPATVPKPNRRNVRI

SDDPQTWARRRRERISEKIRILKRIVPGGAKMDTASM^ 

QIGAPMANPSYLCYYHNSQP*

>G962 (148..1392)

CGTCGACTCTCTACTCAACACCACTCAATTTCATCTCTCTTTTTCCCTTCCATTGTTAGT 
ATAAAAACCAAGCAAACCCTTAATCACTTTTC^TCATCATATATCACCTTAATCCACATG 
CATACACATATCTAGTCTTTTTGATATATGGCAATTGTATCCTCCACAACAAGCATCATT 
CCCATGAGTAACCAAGTCAACAATAACGAAAAAGGTATAGAAGACAATGATCATAGAGGC 
GGCCAAGAGAGTCATGTCCAAAATGAAGATGAAGCTGATGATCATGATCATGACATGGTC 
ATGCCCGGATTTAGATTCCATCCTACCGAAGAAGAACTCATAGAGTTTTACCTTCGCCGA 
AAAGTTGAAGGCAAACGCTTTAATGTAGAACTCATCACTTTCCTCGATCTTTATCGCTAT 
GATCCTTGGGAACTTCCTGCTATGGCGGCGATAGGAGAGAAAGAGTGGTACTTCTATGTG

CCAAGAGATCGGAAATATAGAAATGGAGATAGACCGAACCGAGTAACGACTTCAGGATAT 

TGGAAAGCCACCGGAGCTGATAGGATGATCAGATCGGAGACTTCTCGGCCTATCGGATTA 

AAGAAAACCCTAGTTTTCTACTCTGGTAAAGCCCCTAAAGGCACTCGTACTAGTTGGATC

ATGAACGAGTATCGTCTTCCGCACCATGAAACCGAGAAGTACCAAAAGGCTGAAATATCA 

TTGTGCCGAGTGTACAAAAGGCCAGGAGTAGAAGATCATCCATCGGTACCACGTTCTCTC 

TCCACAAGACATCATAACCATAACTCATCGACATCATCCCGTTTAGCCTTA

CAACACCATTCATCCTCCTCTAATCATTCCGACAACA^ 

AACAATCTCGAGAAGCTCTCCACCGAATATTCCGGCGACGGCAGCACAACAACAACGACC 

ACAAACAGTAACTCTGACGTTACCATTGCTCTAGCCAATCAAAACATATATCGTCCAATG 

CCTTACGACACAAGCAACAACACATTGATAGTCTCTACGAGAAATCATCAAG 

GAAACTGCCATTGTTGACGATCTTCAAAGACTAGTTAACTACCAAATATCAGATGGAGGT 

AACATCAATCACCAATACTTTCAAATTGCTC^CAGTTTCAT 

GCTAACGCAAACGCATTACAATTGGTGGCTGCGGCGACTACAGCGACAACGCTAATGCCT 

CAAACTCAAGCGGCGTTAGCTATGAACATGATTCCTGCAGGAACGAT^ 

TTGTGGGATATGTGGAATCCAATAGTACCAGATGGAAACAGAGATCACTATACTAATATT 

CCTTTTAAGTAATTTAATTAGATCATGATT 

GCGC 

>G962 Amino Acid Sequence (domain in AA coordinates: 53-175) 
MAIVSSTTSIIPMSNQVNNNEKGIKDNDHRGGQESHVQNEDEADDHDHDMVMPGFRFHPT
EEELIEFYLRRKVEGKRFNVELITFLDLYRYDPWELPAMAAIGEKEWYFYVPRDRKYRNG

DRPNRVTTSGYWKATGADRMIRSETSRPIGLKKTLVFYSGKAPKGTRTSWIMNEYRLPHH
ETEKYQKAEISLCRVYKRPGVEDHPSVPRSLSTRHHNHNSSTSSRLALRQQQHHSSSSNH
SDNNLNNNNNINNLEKLSTEYSGDGSTTTTTTNSNSDVTIALANQNIYRPMPYDTSNNTL
IVSTRl^QDDDETAIVDDLQRIiVNYQISDGGN^ 

AAATTATTLMPQTQAALAMNMIPAGTIPNNALWDMWNPIVPDGNRDHYTNIPFK*
>G977 (46..591)

CACCAAA.CTCACCTGAAA.CC CTATTTCCATTTACCATTCAC^ TAATGGCACGACCACAA 
CAACGCTTTCGAGGCGTTAGACAGAGGCATTGGGGCTCTTGGGTCTCCGAAATTCGTCAC 
CCTCTCTTGAAAACAAGAATCTGGCTAGGGACGTTTC 

GCCTACGACGAGGCGGCTAGGCTAATGTGTGGCCCGAGAGCTCGTACTAATTTCCCATAC 

AACCCTAATGCCATTCCTACTTCCTCTTCCAAGCTTCTATCAGCAACTCTTACCGCT 

CTCCACAAATGCTACATGGCTTCTCTTCAAATGACCAAGCAAACGCi^ 

ACGCAGACCGCAAGATCACAATCCGCGGACAGTGACGGTGTGACGGCTAACGAAAGTCAT 

TTGAACAGAGGAGTAACGGAGACGACAGAGATCAAGTGGGAAGATGGAAATGCGAATATG 

CAACAGAATTTTAGGCCATTGGAGGAAGATCATATCGAGCAAA 

CACTACGGTTCCATTGAGCTTTGCTCTGTTTTACCAACTCAGACGCTGTGAGAAATGGC 
TTGTCGTTTTAGCGTATTCTTTTCATTTTTATTTTTGTO 

GTGATGAGAGTAGTAGTGAGAGAAGGCTAATTTCAAGACATTTTGATCTGAATTGGCCTC 

TTTTGAAACACTGATTCTAGTTTCTATAAGAGCAATCGATCATATGCTATGTTATGTATA 

GTATTATAAAAAAATGTTATTTTCTGATTNAAAAAAAAAAAAAAAAAAAAAAA 

>G977 Amino Acid Sequence (domain in AA coordinates: 5-72) 

MARPQQRFRGVRQRHWGSWVSEIRHPLLKTRIWLGTFETAEDAARAYDEAARLMCGPRAR

TNFPYNPNAIPTSSSKLLSATLTAKLHK^ 

ANESHLNRGVTETTEIKWEDGNANMQQNFRPLEEDHIEQMIEELLHYGSIELCSVLPTQT
L*

>G1063 (241..966)

Gl^AAAGAAGATGGATGGGCCACAAGTTGCTATATAAATCCTTCC^CTTCTTGTTGTATA 
CTATTGCTTGAGTTCTGATTGGGCACAGTAGTACCATTGCCATTTCTCTCACAC^T 
TCTCITTCTCTC^TCATGAATC^TC^^^ 
GTAAGCTTTTCACCAGTTTCTCTCCATACCCATTTTAT 

ATGGATTCTGACATAATGAACATGATGATGCATCAGATGGAGAAGCTTCCTGAGTTTTGT 
AACCCTAATTCCTCTTTCTTCTCTCCCGACCACAACAACACTTACCCTTTTCTCTTTAAC 
TCCACTCATTACCAGTCCGATCACTCAATGACCAACGAACCA^ 

GGTTTACTCACTAACCCTTCTTCTATCTCTCCCAACACAGCTTACTCTTCCGTTTTTCT^ 
GACAAAAGAAACAACAGTAACAACAACAATAATGGCACGAAI^TG 

ATGATCTTCCGTATCGCCGTGATGCAACCGATCCATATCGATCCCGAGGCGGTTAAGCCA 
CCGAAGAGGAGGAACGTCAGGATCTCTAAAGATCCTCAAAGCGTGGCGGCTAGGCATAGA
AGGGAGAGAATAAGCGAGAGGATTCGGATTTTGCAACGGCTTGTTCCTGGTGGGACGAAG 
ATGGATACAGCTTCGATGCTCGATGAAGC^TTCATTATGTGAAGTTTTTAAAGAAACAG 
GTGCAGTCTCTGGAGGAGCAGGCGGTGGTTACTGGCGGAGGGGGAGGAGGAGGAGGAAGG 
GTTTTGATCGGTGGAGGTGGAATGACGGCGGCGAGTGGTGGTGGTGGCGGCGGGGGAGTG 
GTTATGAAAGGGTGTGGAACAGTGGGGACTCATCAGATGGTGGGCAATGGACAGATTCTT 
AGATGATGATGATTTTTAATTTTATTATTATTATATTAATGTTGGAGAAAAAGAGAAAAA 
TGATTCTGGAGAGGGAAGCCAAGTAATTTATGTGAGAGTCTTTAATTTAACTTTATTTTC 
TTGTTTAGATAATGTGTAATGATGGTTTTTAAAGCCAAAGACTCTCCATGGTTGTTGGAG 

CGAGTTTG 

>G1063 Amino Acid Sequence (domain in AA coordinates: 131-182)
MDSDIMNMMMHQMEKLPEFCNPNSSFFSPD^ 

GIjLTNPSS I S PNTAYSS VFIJDKRJSn^SNNlW™ IHIDPEAVKP 
PKRRNVRISKDPQSVAARHRRERISERIRILQRLVPGGTKMDTASMLDEAIHYVKFLKKQ
VQSLEEQAVVTGGGGGGGGRVLIGGGGMTAASGGGGGGGVVMKGCGTVGTHQMVGNAQIL

R* 

>G1140 (67..729)

ATCCAAGATCCTCCAACTC^C^GAAAGG(^GATTCAAGAACAGTAGTGAAGGAGAGATCT 

GGTAAAATGGCGAGAGAGAAGATAAGGATAAAGAAGATTGATAACATAACAGCGAGACAA 

GTTACTTTCTCAAAGAGAAGAAGAGGAATCTTCAAGAAAGCCGATGAACTTTCAGTTCTT 

TGCGATGCTGATGTTGCTCTCATCATCTTCTCTGCCACCGGAAAGCTCTTCGAGTTCTCC 

AGCTGAAGAATGAGAGAC^TATTGGGAAGGTATAGTCTTCATGCAAGTAACATCZAACAAA 

TTGATGGATCCACCTTCTACTC^TCTCCGGCTTGAGAATTGTAACCTCTCCAGACTAAGT 

AAGGAAGTCGAAGACAAAACCAAGCAGCTACGGAAACTGAGAGGAGAGGATCTTGATGGA 

TTGAACTTAGAAGAGTTGCAGCGGCTGGAGAAACTACTTGAATCCGGACTTAGCCGTGTG 

TCTGAAAAGAAGGGCGAGTGTGTGATGAGCCAAATTTTCTCACTTGAGAAACGGGGATCG 

GAATTGGTGGATGAGAATAAGAGACTGAGGGATAAACTAGAGACGTTGGAAAGGGCAAAA 

CTGACGACGCTTAAAGAGGCTTTGGAGACAGAGTCGGTGACCACAAATGTGTCAAGCTAC 

GACAGTGGAACTCCCCTTGAGGATGACTCCGACACTTCCCTGAAGCTTGGGCTTCCATCT 

TGGGAATGAATCTGAGAGAGAGAAAGATCCAGCAGAGTTGACTTCGATGGAAGCCCACAA 

ATATTAAGTCTACCTTTTCCCTTTCTTTTCTTTGAATAAGTGTTGAAAAAGAATTGAGAT 

GGGAAGGATGAATTCTCATTGCATTGCAGAGAAGCAAGTTTCAGATATTGTACGTGTTAT 

TGGGTCTTTATAACTATTTTTCTCCCCAAAAAAAAAAAAAAAAAAAAAAAAATU^AA 

>G1140 Amino Acid Sequence (conserved domain in AA coordinates: 2-57)

MAI^KIRIKKIDNITARQVTFSKRRRGIFKKADELS 

RMRDILGRYSLHASNINKLMDPPSTHLRLENCNLSRLSKEVEDKTKQLRKLRGEDLDGLN

LEELQRLEKLLESGLSRVSEKKGECVMSQIFSLEKRGSELVDENKRLRDKLETLERAKLT
TLKEALETESVTTNVSSYDSGTPLEDDSDTSLKLGLPSWE*
>G1425 (43..1005)

ACTCTCTCAAACC^TAAAAAATATTOTCCGATC^TCATTTTAATGGAGAGTACAGATTCT 
TCCGGTGGTCCTCCGCCGCCGCAACCAAACCTCCCTCCAGGATTCCGGTTTCATCCAACA 
GACGAAGAACTTGTAATTCATTACCTCAAACGCAAAGCAGATTCTGTTCCTTTACCAGTC 
GCGATCATCGCCGACGTTGATCTTTACAAATTTGATCCATGGGAACTTCCCGCGAAAGCT
TCGTTTGGAGAACAAGAATGGTATTTTTTCAGTCCAAGAGATCGGAAATATCCCAACGGA
GCTAGACCTAACCGAGCTGCGACTTCCGGTTATTGGAAAGCGACTGGTAC^GATAAACCG 
GTGATTTCAACCGGCGGTGGTGGTAGTAAAAAAGTGGGAGTTAAAAAGGCTCTAGTGTTT 
TACAGTGGTAAACCACCAAAAGGAGTTAAATCAGATTGGATTATGCATGAATATCGGTTA 
ACTGATAATAAACCXACTCACATTTGTC^CT 

GATGATTGGGTGTTGTGTCGTATCTACAAGAAAAACAATAGTACAGCATCTAGACATCAT
CATCATCTTCATCATATTCATCTAGATAATGATCATCATCGTCATGATATGATGATTGAT 
GATGATCGATTCCGTCATGTTCCTCCTGGTCTTCACTTCCCGGCGATTTTTTCTGACAAT 
AATGATCCGACGGCTATATATGATGGTGGCGGCGGCGGATACGGAGGTGGAAGTTACTCG 
ATGAATCATTGTTTCGCATCTGGATCAAAGCAGGAGCAGTTGTTTCCACCGGTGATGATG 
ATGACTAGTCTAAATCAAGATTCCGGTATTGGATCGTCGTCGTCACCTAGCAAGAGATTT 
AACGGCGGCGGCGTTGGAGATTGTTCGACTTCTATGGCGGCGACGCCGTTAATGCAGAAC 
CAAGGTGGGATTTACCAATTGCCTGGTTTGAATTGGTATTCT 

AAGAATTTTTATUU^TTTGTGTATATATATACGGTTTGAGTGATTAGGGGGCATTGGGGGA 
TTTATTTACGGTTGATTATTATTGTAGTGTTATAGAACTAAGGAGATTAAATTAAATAGA
TTGGAGGAAAAAAAAAAAAAAAAA 

>G1425 Amino Acid Sequence (domain in AA coordinates: 20-173) 

MESTDSSGGPPPPQPNLPPGFRFHPTDEELVIHYLKRKADSVPLPVAIIADVDLYKFDPW 

ELPAKASFGEQEWYFFSPRDRKYPNGARPNRAATSGYWKATGTDKPVTSTGGGGSKKVGV

KKALVFYSGKPPKGVKSDWIMHEYRDTDNKPTHIC^ 

TASRHHHHLHHIHLDNDHHRHDMMIDDDRFRHVPPGLHFPAIFSDNNDPTAIYDGGGGGY

GGGSYSMNHCFASGSKQEQLFPPVMMMTSLNQDSGIGSSSSPSKRFNGGGVGDCSTSMAA

TPLMQNQGGIYQLPGLNWYS*
>G1449 (105..581)

TAGACAGAGAGAAATAGAAATAGAGAGAGAGAGACATGAAGAGCACTCTCAATAGAGAAG 
AGAAGGAAGCATGAAGCTAGCTCTGCAGCTTCAAGGTCTCATTAATGGAGGTCTCTAACT 
CTTGTTCTTCATTTTCTTCATCCTCTGTCGACAGTACTAAACCTTCTCCTTCTGAATCTT 
CTGTTAATCTCTCCCTTAGTCTC^CATTTCCTTCTACTTCTCC^CAAAGAGAAGC^AGAC 
AAGATTGGCCACCGATAAAGTCTAGATTAAGAGATACACTAAAGGGTCGTCGTCTTCTTC 
GTCGTGGTGATGACACTTCTCTCTTTGTTAAGGTTTATATGGAAGGTGTTCCCATTGGAA 
GAAAACTCGACCTTTGCGTATTOTCAGGCTACGAGAGTCTATTAGAAAATCTCTCTCACA 
TGTTCGATACTTCAATCATCTGCGGTAATCGAGATCGAAAACATCATGTTTTGACATATG
AAGACAAGGATGGAGATTGGATGATGGTCGGAGATATTCCATGGGATATGTTTCTTGAAA 
CCGTGAGAAGACTAAAGATC^CGAGACCGGAGAGGTATTAAAACTTGGATCGGTCAAGGC 
TGTGATTGCGCAGTTACGAGACGTGTAAGATTTAGGCATTGATGAAGAGACTTGAGGCGG 
GACGGAGCTATTGCTGCATATTGCAACAAAGGCCTTGAAGAAGTTGGAGAATTGATTGAT 
GCATATATTTATTTATATGACACCTTTGAGTGTGTTTTTTCTTATAAATAAATCACAATA 

TCCAAGACTTCTCTTTAAA 

>G1449 Amino Acid Sequence (domain in AA coordinates: 48-53,74-107,122-152) 
MEVSNSCSSFSSSSVDSTKPSPSESSVlsn^ 

GRRLLRRGDDTSLFVKVYMEGVPIGRKLDLCVFSGYESLLENLSHMFDTSIICGNRDRKH

HVLTYEDKDGDWMMVGDIPWDMFLETVRRLKITRPERY*
>G1897 (1..678) 

ATGCCTTCTGAATTCAGTGAATCTCGTCGGGTTCCTAAGATTCCCCACGGCCAAGGAGGA 
TCTGTTGCGATTCCGACGGATCAACAAGAGCAGCTTTCTTGTCCTCGCTGTGAATCAACC 
AACACCAAGTTCTGTTACTACAACAACTACAACT

TCTTGTCGCCGTTACTGGACTCATGGAGGTACTCTCCGTGACATTCCCGTCGGTGGTGTT 
TCCCGTAAAAGCTCAAAACGTTCCCGGACTTATTCCTCTGCCGCTACCACCTCCGTTGTC 
GGAAGCCGGAACTTTCCCTTACAAGCTACGCCTGT^ 

GGCGGTATCACGACGGCGAAGGGAAGTGCTTCGTCGTTCTATGGCGGTTTCAGCTCTTTG 
ATCAACTACAACGCCGCCGTGAGCAGAAATGGGCCTGGTGGCGGGTTTAATGGGCCAGAT
GCTTTTGGTCTTGGGCTTGGTCACGGGTCGTATTATGAGGACGTCAGATATGGGCAAGGA 
ATAACGGTCTGGCCGTTTTCAAGTGGCGCTACTGATGCTGCAACTACTACAAGCCACATT 
GCTCAAATACCCGCGACGTGGCAGTTTGAAGGTCAAGAGAGCAAAGTCGGGTTCGTGTCT 
GGAGACTACGTAGCGTGA 

>G1897 Amino Acid Sequence (domain in AA coordinates: 34-62)
MPSEFSESRRVPKIPHGQGGSVAIPTDQQEQLSCPRCESTNTKFCYYNNYNFSQPRHFCK
SCRRYWTHGGTLRDIPVGGVSRKSSKRSRTYSSAATTSVVGSRNFPLQATPVLFPQSSSN
GGITTAKGSASSFYGGFSSLINYNAAVSRNGPGGGFNGPDAFGLGLGHGSYYEDVRYGQG
ITVWPFSSGATDAATTTSHIAQIPATWQFEGQESKVGFVSGDYVA*
>G2143 (89..784)

TCTTCTTCTTCCTCCATACCTTATCTCACCAGCTTCTCCATATCTCTCAAAGAAAAAACA 
AACCCTATAAATTCCACAAAAAAGGAGGATGGATAACTCCGACATTCTAATGAACATGAT 
GATGCAGCAGATGGAGAAGCTTCCTGAACACTTCTCTAACTCAAACCCTAACCCTAATCC 
CCATAACATTATGATGCTTTCTGAATCCAACACCCACCCGTT 

TTCTCATCTCCCATTTGACCAAACCATGCCTCACCACCAACCCGGTTTAAATTTCCGGTA 
CGCCCCCTCCCCGTCATCATCTCTCCCGGAGAAGAGAGGAGGCTGCAGCGACAACGCCAA 
CATGGCGGCGATGAGAGAGATGATCTTTCGAATAGCCGTGATGCAGCCTATACATATTGA 
TCCGGAATCCGTAAAGCCACCAAAGAGAAAGAACGTGAGGATCTCTAAGGATCCACAGAG 
CGTGGCAGCTCGGCATCGAAGGGAGAGGATAAGCGAGCGGATTCGGATTCTTCAGCGGCT 
TGTTCCCGGTGGGACTAAGATGGATACGGCGTCGATGCTCGATGAGGCTATCCATTACGT 
TAAGTTTCTCAAGAAGCAAGTGCAGTCGCTGGAGGAACATGCGGTGGTTAACGGCGGAGG
AATGACGGCGGTGGCCGGAGGAGCACTTGCGGGTACTGTTGGTGGAGGATATGGAGGAAA 
AGGGTGTGGCATTATGCGGTCTGATCATCACCAGATGCTTGGAAATGCACAGATTCTTAG 
ATGATGATGATGTTGATTTTTAAATATATATCATATGTTTATTAATATGACGGGAAAAAA 

TTTATCTTTCCGGGTTTCGATAATGTTTGGGATGGTTAATO 

AACTTGGTTGTAAAGACTAAAGAATAAGCATAGTTTATCAATTTATCATTACTAAATGAA 
ATAG 

>G2143 Amino Acid Sequence (domain in aa coordinates: 128-179) 
MDNSDILMNMMMQQMEKLPEHFSNS 

PHHQPGLNFRYAPSPSSSLPEKRGGCSDNANMAAMREMIFRIAVMQPIHIDPESVKPPKR
KNVRISKDPQSVAARHRRERISERIRILQRLVPGGTKMDTASMLDEAIHYVKFLKKQVQS
LEEHAVVNGGGMTAVAGGALAGTVGGGYGGKGCGIMRSDHHQMLGNAQILR*
>G2535 (1..1005) 

ATGAACATATCAGTAAACGGAGAGTCACAAGTACCT 

GAGGAAGAGCTCTTGAAGTATTACCTCCGCAAGAAAATCTCTAACATCAAGATCGATCTC 
GATGTTATTCCTGACATTGATCTCAACAAGCTCGA
AAGATTGGAACGACGCCGCAAAACGATTGGTACTTTTATAGCCATAAGGACAAGAAGTAT 
CCCACCGGGACTAGAACCAA(^GAGCCACC^CGGTCGGATTTTGGAAAGCGACGGGACGT 
GACAAGACCATATATACC^ATGGTGATAGAATCGGGATGCGAAAGACGCTTGTCTTCTAC 
AAAGGTCGAGCCCCTCATGGTCAGAAATCCGATTGGATCATGCACGAATATAGACTCGAC 
GAGAGTGTATTAATCTCCTCGTGTGGCGATCATGACGTCAACGTAGAAACGTGTGATGTC 
ATAGGAAGTGACGAAGGATGGGTGGTGTGTCGTGTTTTCAAGAAAAATAACCTTTGCAAA 
AACATGATTAGTAGTAGCCCGGCGAGTTCGGTGAAAACGCCGTCGTTCAATGAGGAGACT 
ATCGAGCAACTTCTCGAAGTTATGGGGCAATCTTGTAAAGGAGAGATAGTTTTAGACCCT 
TTCTTAAAACTCCCTAACCTCGAATGCC^TAACAACACCACCATCACGAGTTATCAGTGG 
TTAATCGACGACCAAGTCAACAACTGCCACGTCAGCAAAGTTATGGATCCCAGCTTCA 
ACTAGCTGGGCCGCTTTGGATCGGCTCGTTGCCTCACAGTTAAATGGGCCCAACTCGTAT 
TCAATACC7VGCCGTTAATGAGACTTCACAATCACCGTATCATGGACTGAACCGGTCCGGT 
TGTAATACCGGTTTAACACCAGATTACTATATACCGGAGATTGATTTATGGAACGAGGCA 
GATTTCGCGAGAACGACATGCCACTTGTTGAACGGTAGTGGATAA 

>G2535 Amino Acid Sequence (conserved domain in AA coordinates: 11-114)
MNISWGQSQVPPGFRFHPTEEELLKVYLRK^^ 

KIGTTPQNDWYFYSHKDKKYPTGTRTNRATTVGFWKATGRDKTIYTNGDRIGMRKTLVFY

KGRAPHGQKSDWIMHEYRLDESVLISSCGDHDVNVETCDVIGSDEGWVVCRVFKKNNLCK

l^ISSSPASSVTCTPSFlSrEETIEQLLEVMGQSCKGEIV^^ 

LIDDQVITOC^SKMVIDPSFITSWAALDRL^^ 

CNTGLTPDYYIPEIDLWNEADFARTTCHLLNGSG*

>G2557 (94..1215)

TCGACTTCCTGTGAACTCATCTGTTTGTTCTCTTCTTCCGGTTTCACTTTTTCATGTCCT 

GCCGTTATTACAACGAGGATTGTGTTTGATCCGATGGAAGGATTGGAATCTGTGTACGCT 

CAAGCTATGTATGGAATGACACGAGAGAGCAAAATCATGGAGC^TCAAGGATCAGATTTG 

ATTTGGGGAGGAAATGAGCTAATGGCTCGAGAACTCTGTTCTTCTTCTTCTTATCACCAC 

CAACTCATTAATCCGAATCTTAGCAGCTGTTTCATGTCTGATCTTGGAGTCTTAGGTGAG 

ATTCAACAGCAGCAACATGTTGGCAACAGAGCTAGCTCGATAGATCCATCATCA 

TGTTTGTTATCTGCGACGTCGAATAGCAACAACACCTCGACGGAGGACGATGAAGGAATA 

TCTGTGCTTTTCTCAGATTGTCAGACTCTTTGGAGCTTTGGTGGAGTCTCATCTGCAGAG 

TCTGAGAACAGAGAGATCACTACTGAGACGACAAC^CGATAAAGCCTAAGCCTTTGAAG 

AGAAACAGAGGAGGAGATGGAGGAACTACTGAGACTACAAC^CAACAACAAAACCTAAG 

TCTTTGAAGAGAAACAGAGGAGACGAGACAGGAAGTCACTTTAGTCTTGTTCATCCTCAA 

GATGATTCGGAGAAAGGAGGTTTCAAGCTTATATACGATGAGAATCAATCGAAATCAAAG 

AAACCAAGAACAGAGAAAGAACGAGGCGGTTCTTCGAACATTAGTTTCCAACATTCAACT 

TGTTTGTCTGACAATGTCGAGCCCGATGCTGAGGCGATTGCACAAATGAAGGAGATGATA 

TACAGAGCGGCTGCATTTAGACCGGTGAATTTCGGGTTAGAGATTGTGGAGAAGCCTAAG 

AGGAAGAACGTCAAGATATCGACGGATCCTCAAACGGTTGCAGCGAGACAGAGAAGGGAG 

AGGATAAGTGAGAAGATTAGGGTTTTACAAACATTGGTTCCAGGTGGGACGAAGATGGAT 

ACTGCATCAATGCTTGATGAAGCTGCTAATTATCTCAAGTTCCTTAGAGCACAAGTAAAA
GCTTTAGAAAACTTGAGACCCAAGCTTGACCAAACGAA^

ACATCGTTTCCATTATTCCACCCATCTTTTCTTCCATTGCAAAATCCTAATCAAATCCAT 

CATCCAGAGTGTTGACAGATTATAAACTTTTGAGTTTCATCATCATCAACAGAATCATG^ 

CGTCTTGATTGTTTTAGCAGTTCTCAAGAAAGGCAACTTCTGTGACAAGGGTGGTGTCGG 

GCAGTGTTGTTTAC^CTTTCCAGTCTTTGTTTTGCATTTCTTTTTATA 

TTTATATAGAATCTGTGGAATTCGAGGGTTGAAATATTGTGAAAAACAGAGCCGCAAGAG 

GTTAATTACAGTCTCTGCAATATTTTCAACCTTTTATTACTTTATTAGAGTAAAGATAGC 

GT 

>G2557 Amino Acid Sequence (domain in aa coordinates: 278-328) 
MEGLESVYAQAMYGMTRESKIMEHQGSDLIWGGNELMARELCSSSSYHHQLINPNLSSCF

MSDLGVLGEIQQQQHVGNRASSIDPSSLDCLLSATSNSNNTSTEDDEGISVLFSDCQTLW
SFGGVSSAESENREITTETTCTIKPKPLKRmGGDGGTT^ 

SHFSLVHPQDDSEKGGFKLIYDENQSKSKKPRTEKERGGSSNISFQHSTCLSDNVEPDAE 

AIAQMKEMIYRAAAFRPVNFGLEIVEKPKRKNVKISTDPQTVAARQRRERISEKIRVLQT

LVPGGTKMDTASMLDEAANYLKFLRAQVKALENLRPKLDQTNLSFSSAPTSFPLFHPSFL

PLQNPNQIHHPEC*

>G259 (52..786)

GAGATCTTCTACTACTTGTTTTCITCAAGAATAATAATTTTCGTTTTATATATGGA 

GCTGGTGAACATTTACGGTGTAACGATAACGTTAACGACGAGGAGCGTTTGCCATTGGAG 

TTTATGATCGGAAACTCAACATCCACGGCGGAGCTACAGCCGCCTCCACCGTTCTTGGTA 

AAGACATACAAAGTGGTGGAGGATCCGACGACGGACGGGGTTATATCTTGGAACGAATAC 

GGAACTGGTTTCGTCGTGTGGCAGCCGGCAGAATTCGCTAGAGATCTGTTACCAACACTT 

TTCAAGCATTGCAACTTCTCTAGCTTCGTTC^ 

GTAACGACGATAAGATGGGAATTTAGTAATGAGATGTTTCGAAAGGGGCAAAGAGAGCTT 
ATGAGCAATATCCGAAGAAGGAAGAGCCAACATTGGTCA^ 

GTTGTACCAACAACAACGATGGTGAATCAAGAAGGTCATCAACGGATTGGGATTGATCAT 
CACCATGAGGATCAACAGTCTTCCGCCACTTCATCCTCTTTCGTATACACTGC^TTACTC 
GACGAAAACAAATGCTTGAAGAATGAAAACGAGTTATTAAGCTGCGAACTTGGGAAAACC 
AAGAAGAAATGCAAGCAGCTTATGGAGTTGGTGGAGAGATACAGAGGAGAAGACGAAGAT 
GCAACTGATGAAAGTGATGATGAAGAAGATGAAGGGCTTAAGTTGTTCGGAGTAAAACTT 
GAATGAAACTAGATTGCTAGATTGATATTCGTAATATACCAGTTTCTTCATATTCTTAGA 
AGTTTTGGATAACTATATATAGTACTCTTT 

>G259 Amino Acid Sequence (domain in AA coordinates: 27-131) 

MEDAGEHLRCNDNVNDEERLPLEFMIGNSTSTAELQPPPPFLVKTYKVVEDPTTDGVISW

NEYGTGFVVWQPAEFARDLLPTLFKHCNFSSFVRQLNTYGF

RELMSNIRRRKSQHWSHNKSNHQVVPTTTMVNQEGHQRIGIDHHHEDQQSSATSSSFVYT

ALLDENKCLKNENELLSCELGKTKKKCKQLMELVERYRGEDEDATDESDDEEDEGLKLFG

VKLE*

>G353 (82..570)

ACCAAACTCAAAAAACAGAAACCACAAGAGGATCATTTCATT 

ATCATCATCATCAGAAGAAAAATGGTTGCGATATCGGAGATCAAGTCGACGGTGGATGTC 
ACGGCGGCGAATTGTTTGATGCTTTTATCTAGAGTTGGACAAGAAAACGTTGACGGTGGC 
GATCAAAAACGCGTTTTCACATGTAAAACGTGT^ 

TTAGGAGGTCACCGTGCGAGTCACAAGAAGCCTAACAACGACGCTTTGTCGTCTGGATTG 
ATGAAGAAGGTGAAAACGTCGTCGCATCCTTGTCCCATATGTGGAGTGGAGTTTCCGATG 
GGACT^AGCTTTGGGAGGACACATGAGGAGACACAGGAACGAGAGTGGGGCTGCTGGTGGC 
GCGTTGGTTACACGCGCTTTGTTGCCGGAGCCCACGGTGACTACGTTGAAGAAATCTAGC 
AGTGGGAAGAGAGTGGCTTGTTTGGATCTGAGTCTAGGGATGGTGGACAATTTGAATCTC
AAGTTGGAGCTTGGAAGAACAGTTTATTGATTTTATTTATTTTCCTTA7UVTTTTCTGAAT 
ATATTTGTTTCTCTCATTCTTTGAATTTTTCTT7^ATATTCTAGATTATACATACATCCGC 
AGATTTAGGAAACTTTCATAGAGTGTAATCTTTTCTTTCTGTAAAAATATATTTTACTTG 
TAGCAAA 

>G353 Amino Acid Sequence (domain in aa coordinates: 41-61, 84-104) 

MVAISEIKSTVDVTAANCLMLLSRVGQENVDGGDQKRVFTCKTCLKQFHSFQALGGHRAS 

HKKPNNDALSSGLMKKVKTSSHPCPICGVEFPMGQALGGHMRRHRNESGAAGGALVTRAL

LPEPTVTTLKKSSSGKRVACLDLSLGMVDNLNLKLELGRTVY*

>G354 (27..533)
CCTAGAAGTCACTAAGTCGATTCAAAATGGTTGCGAGAAGTGAGGAAATTGTGATAGTGG
AAGAAGATACGACTGCGAAATGTTTGATGTTGTTATCAAGAGTCGGAGAATGCGGCGGCG 
GCTGCGGGGGAGATGAACGTGTTTTCCGATGCAAGACTTGTCTTAAAGAGTTCTCATCGT 
TTCAAGCTTTGGGAGGTCATCGTGCAAGCCACAAGAA^^ 

(^CTTCTTGGATCCTTGTCCAACAAGAAAACTAAAACGTCTC^TCCTTGTCCGATATGTG 

GAGTGAAGTTTCCGATGGGACAAGCTCTTGGTGGTCACATGAGGAGACATAGGAACGAGA 

AAGTCTCAGGCTCGTTGG1TACACGTTCTTTTCTACCGGAGACGACGACGGTGACGGCTT 

TGAAGAAATTTAGTAGTGGGAAGAGAGTGGCTTGTTTGGATTTGGACTTAGATTCGATGG 

AGAGTTTGGTCAATTGGAAGTTGGAGTTGGGAAGAACGATTTCTTGGAGTTAAGTTTTTG 

GGTTGTATACAGTTTC^CATGATTTTGTAATCTTTGTT^ 

ATGTGAATATTATTTTGATACAATAAAA 

>G354 Amino Acid Sequence (domain in aa coordinates: 42-62, 88-109) 
MVARSEEIVIVEEDTTAKCLMLLSRVGECGGGCGGDERVFRCKTCLKEFSSFQALGGHRA

SHKKLINSDNPSLLGSLSNKKTKTSOT 

SFLPETTTVTALKKFSSGKRVACLDLDLDSMESLVNWKLELGRTISWS*

>G638 (86..1861)

CTATCGATAATTGATCTTCTCT

TTCGGCTGAATATAAATCTGAAAAAATGGATCAAGATCAGCATCCTGAGTACGGT 

GGAGCTCCGGCAGCTCATGAAAGGCGGAGGAAGGACGACTACTACAACACCGTCTACTTC 

TTCT(^TTTTCCCTCTGATTTCTTCGGTTTTAACCTTGCTCCGGTGCAGCCACCGCCACA 

CCGTCTTCATCAGTTCACTACTGATCAAGATATGGGTTTCTTGCCACGTGGCATACATGG 

ATTGGGTGGAGGTTCTTCAACGGCTGGAAATAACAGTAACTTAAACGCGAGTACTAGTG^ 

TGGAGGAGTTGGGTTTAGTGGGTTTCTTGACGGTGGTGGTTTCGGCAGCGGAGTAGGAGG 

AGACGGTGGAGGAACTGGAAGGTGGCCGAGACAAGAAACCCTAACTCTGTTGGAAATTAG 

ATCTCGTCTTGATCATAAATTCAAAGAAGCTAATCATAAAGGACCTCTTTGGGATGAAGT 

TTCTAGGATTATGTCCGAGGAACATGGATACCAAAGGAGTGGGAAGAAATGCAGAGAGAA 

GTTTGAGAATCTGTACAAATACTATAGTAAGACTAAAGAAGGCGAAGGCGGAAGACAAGA 

CGGAAAAC^TC^C^GATTTTTCCGC<^GCTC(^GCGCTATACGGGGATTCTAATAACOT 

GGTTTCTTGTCCCAATCATAACACGCAGTTCATGAGCAGTGCTCTTCATGGTTTCCATAC 

TCAAAACCCTATGAACGTTGCTACAACAACGTCCAACATCC^TAACGTTGATAGTGTTCA 

TGGTTTTCATCAAAGCCTTAGTCTTTCTAACAACTACAACTCCTCCGAGCTTGAGCTGAT 

GACTTCCTCTTCGGAAGGGAATGATTCTAGTAGTAGAAGGAAAAAGAGGAGTTGGAAAGC 

GAAGATAAAGGAGTTCATTGATACGAACATGAAAAGGTTGATAGAGAGGGAAGATGTTTG 

GCTTGAGAAGTTGACAAAGGTTATTGAAGACAAAGAGGAACAACGGATGATGAAAGAAGA 

GGAATGGAGGAAGATTGAAGCTGCAAGGATTGATAAAGAGCATTTGTTTTGGGCTAAAGA 

GAGGGCGAGGATGGAAGCTAGGGATGTTGCGGTGATTGAGGCATTGCAATACTTGACAGG 

AAAGCCATTGATAAAGCCGCTGTGTTCATCCCCGGAAGAGAGGACAAATGGTAATAATGA 

GATCCGAAACAATAGTGAGACACAGAATGAGAATGGAAGCGATCAAACGATGACTAACAA 



GATAAGAACGAGCATGGACTCGACCTTTCAAGAGATATTAGGAGGGTGCTCGGATGAGTT 
TCTATGGGAGGAAATCGCAGCGAAGTTGATTCAGTTAGGGTTTGATCAGAGAAGTGCCTT 
ATTATGCAAGGAAAAGTGGGAATGGATAAGCAATGGAATGAGGAAAGAAAAGAAGCAAAT 
' CAACAAGAAAAGAAAGGATAATTCGTCCAGCTGCGGCGTGTACTACCCGAGAAACGAAGA 
AAATCCAATCTACAATAATCGAGAAAGTGGATATAATGATAATGATCCGCATCAAATCAA 
CGAACAAGGGAATGTAGGTTCTTCAACATGAAACGCAAA 

TGGAAATCCGAGCGGTGCAATGGCTGCTAGTACAAACTGCTTCCCGTTCTTCATGGGAGA 
TGGAGATCAGAATTTGTGGGAGAGTTATGGTTTGAGGCTCAGTAAAGAAGAGAATCAGTA 
AGTAATTTCTCTTAATGAAGAAGAAGAAGGTAATCATGTGGTTAACTAATTCTTTTGAGT 
TAGCTATATATGAGATAAACCTTGACTTAGCTATTATATGTCACATGCTGCTTAGAATTA 
AGAAATATTTGTTGGGGCTTAACGAATTATATATCAGCATATATAAGATGAGAGTCTAAG 
AATTATATCAAATTAGGCTTTAACCAACGTACGATTATATATTATGTTTTCATGTATTTA 
TTCTGTAAGACTTTTTAATATCAATCTTTCTCTAAA 

>G638 Amino Acid Sequence (domain in AA coordinates: 119-206) 

MDQDQHPQYGIPELRQLMKGGGRTTTTTPSTSSHFPSDFFGFNLAPVQPPPHRLHQFTTD

QDMGFLPRGIHGLGGGSSTAGNNSNLNASTSGGGVGFSGFLDGGGFGSGVGGDGGGTGRW

PRQETLTLLEIRSRLDHKFKEANHKGPLWDEVSRIMSEEHGYQRSGKKCREKFENLYKYY

SKTKEGBAGRQDGKHHRFFRQLQALYGDSNNLVSCPNHOT
TTSNIHNVDSVHGFHQSLSLStfNW
NMKRLIERQDVWLEKLTKVIEDKEEQRMMKE 

VAVIEALQYLTGKPLIKPLCSSPEERTNGNNEIRNNSETQNENGSDQTMTNNVCVKGSSS

CWGEQEILKLMEIRTSMDSTFQEILGGCSDEFLWEEIAAKLIQLGFDQRSALLCKEKWEW

ISNGMRKEKKQINKKRKDNSSSCGVYYPRNEENPIYNNI^ 

TSNANANANVTTGNPSGAMA^ 

>G869 (428..1402)

AGGAACAGTGAAAGGTTCGGTTTTTTGGGTTTCGATCTGATAATCAACAAGAAAAAAGGG 
TTTGATTTATGTCGGCTGGGTTTGAATCGACTGTGATTTTGTCTTTGATTCATATCTCTT 
CTCCGATTTCATCATCATCTTCCCCAT 

CTCTTCACTTCTGCTGTAATAAGCAGAGGCTTGTTCTGGAGAC^CCTTCTCTTTCCATGC 
GCTTAAGACCCAAAAGGACTTGTTCTAGTGTTGAAGTCTTTGGGGGTTTTCACATAAAGC 
AGCAAAAGTTTTCTTTTTTCATAGTTCGCTGAGAGTTTTGAGTTTTG 
TTTGACCTTTTAGAGTGATTTTTTGTTCOT 

TTTAACAATGGTTGCGATTAGAAAGGAACAGTCTTTGAGTGGTGTTAGTAGCGAGATTAA 

GAAGAGAGCTAAGAGAAACACTCTATCGTCCCTTCCTCAAGAAACCCAACCTTTGAGGAA 

AGTCCGTATTATTGTGAATGATCCTTATGCTACTGATGATTCCTCTAGTGATGAGGAAGA 

GCTTAAGGTTCCTAAGCCAAGGAAAATGAAACGTATCGTTCGTGAGATTAACTTTCCTTC 

TATGGAAGTTTCTGAACAGCCTTCTGAGAGTTCTTCTCAGGACAGTACTAAAACTGATGG 

CAAGATAGCTGTGTCAGCTTCTCCTGCTGTTCCTAGGAAGAAGCCTGTTGGTGTTAGGCA 

AAGGAAATGGGGGAAATGGGCTGCTGAGATTAGAGATCCTATTAAGAAAACTAGGACTTG 

GTTGGGTACTTTTGATACTCTTGAAGAAGCTGCTAAAGCTTATGATGCTAAGAAGCTTGA 

GTTTGATGCTATTGTTGCTGGAAATGTGTCCACTACTAAACGTGATGTTTCTTCATCTGA 

GACTAGCCAATGCTCTCGTTCTTCACCTGTTGTTCCTGTTGAGCAAGATGACACrTCTGC

ATCAGCTCTCACTTGTGTCAACAACCCTGATGACGTCTCGACCGTTGCTCCAACTGCTCC 

AACTCCAAATGTTCCTGCTGGTGGAAACAAGGAAACGTTGTTCGAl^TCGACTTTACTAA 

TCTACAGATCCCTGATTTTGGTTTCTTGGCAGAGGAGCA 

TTTCCTCGCGGATGATCAGTTTGATGATTTCGGCTTC 

AGATAACGGTCCAAGTGCGTTACCAGATTTCGACTTTGCGGATGTTGAAGATCTTCAGCT 
AGCTGACTCTAGTTTCGGTT^CCTTGATCAACTTGCTC 

AAAAAGTTTTGCAGCTTCATAGGATCTTGCTTAGTAATGTTAAGTGAGAAGAGTGTTTTG 
TTTTTTCGTTTATGCTTTAGTAATTTAAGACATACAAAAGTGTGTGTTCCGGATTGTAGT 
AAGATCTTAAGACATAAAGCCGGGTTTTGCAATTAGGAATCGAGTTTTAATGAAGTTTTA 
GTTTATGTTTG 

>G869 Amino Acid Sequence (domain in AA coordinates: 109-177) 
MVAIRKEQSLSGVSSEIKKRAKRNTLSSLPQETQPLRKVRIIVNDPYATDDSSSDEEELK
VPKPRKMKRIVREINFPSMEVSEQPSESSSQDSTKTDGKIAVSASPAVPRKKPVGVRQRK
WGKWAAEIRDPIKKTRTWLGTFDTLEEAAKAYDAKKLEFDAIVAGNVSTTKRDVSSSETS
QCSRSSPVVPVEQDDTSASALTCVNNPDDVSTVAPTAPTPN^

IPDFGFLAEEQQDLDFDCFLADDQFDDFGLLDDXQGFEDNGPSALPDFDFADVEDLQLAD

SSFGFLDQLAPINISCPLKSFAAS* 

>G1645 (25..1104)

CGTCGACCTCCCAACACTAACTCCATGTTTATAACGGAAAAACAAGTGTGGATGGATGAG 

ATCGTCGCAAGAAGAGCTTCTTCTTCTTGGGACTTCCCTTTCAACGACATTAATATTCAT 

CAGCATCATCATCGTCACTGGAZVCACAAGTCATGAGTTTGAAATCTTGAAGAGTCCTCTT 

GGAGATGTAGCGGTTCACGAAGAAGAGAGTAATAATAATAACCCTAATTTCAGTAACAGC 

GAGAGTGGTAAGAAGGAGACAACAGATAGTGGTCAGTCTTGGTCCTCGTCGTCTTCAAAA 

CCATCGGTCTTGGGGAGAGGACATTGGAGACCAGCTGAAGATGTTATy^CTCZAAAGAGCTT 

GTCTCCATTTACGGCCCACAAAACTGGAACCTCATAGCTGAAAAGCTTCAAGGAAGATCT

GGGAAGAGCTGTAGACTACGATGGTTTAACCAATTGGACCCGAGGATAAACCGAAGAGCT 

TTCACAGAAGAAGAAGAGGAGAGGCTGATGCAAGCACATAGGCTTTATGGTAACAAATGG 

GCAATGATTGCGAGGCTTTTCCCTGGTAGAACTGATAATTCAGTGAAGAACCATTGGCAT 

GTTGTCATGGCTCGTAAGTATAGAGAACACTCTTCTGCTTACCGTAGGAGAAAGCTTATG 

AGTAATAATC(^CTTAAACCTGACCT<^C 

TACCACTCTTTTATCTCCACTAATCAT^ 

ACTCATCACCTGGTTAATAATGCCCCTATCACGAGTGACCATAACCAGCTTGTGTTGCCT 
TTCCATTGCTTTCAAGGTTATGAGAACAATGAACCTCCGATGGTTGTGAGTATGTTTGGC
AACCAAATGATGGTCGGCGATAACGTTGGTGCCACGTC^GACGCGTTATGC^ATATTCCG
CACATTGACCCTAGTAACCAAGAGAAACCGGAGCCAAATGATGCAATGCATTGGATCGGA 
ATGGACGCGGTAGATGAGGAGGTGTTCGAAAAGGCTAAGCAGCAACCACATTTTTTCGAT 
TTTCTTGGCTTGGGGACGGCGTGAATGTTGAACAAATTGGTGTTAATCAGATAACGACAG 

TGGC 

>G1645 Amino Acid Sequence (domain in AA coordinates: 90-210) 

MFITEKQWMDEIVARRASSSWDFPFNDINIHQHHH 

ESNNimPNFSNSESGKKETTDSGQSWSSSSSK^ 

WOTjIAEKLQGRSGKSCRLRWFNQIjDPRINRRAFTEEE 

GRTDNS VKIvTHWHVVKARKYREHSSAYRRR 

HYFAQPFPEFNLTHHLVI^APITSDHNQLVLP 

VGATSDALC^IPHIDPSNQEKPEPNDAMHWIGMDAVDEEVFEKAKQQPHFFDFLGLGTA*
>G1038 (240..1574)

GCTCGTTTTCAAATTAAAAACAGGGAGAAATTTGGAAATTCCAGTACGACGGGAGATAAA 
ACCTAACATACGCCATGGTGACCGTTATCTAAACTACGCCAAT^ATATTTGAAGTGTCGTC 
GTTTCATAATAAAACGCAAACAAAAACCC^ 

TCTCGCCACTTTCTCTGCTCTTTTCTTTCTCTCTCTCTl^CTTGTTTTCGCCGGCXSATCA 

TGGAGAAAAGCGGCTTCTCTCCCGTCGGTCTAAGGGTTCTTGTCGTAGACGATGATCCAA 

CTTGGCTCAAGATTCTCGAGAAAATGCTCAAGAAGTGTTCTTACGAAGTAACGACCTGTG 

GATTAGCTAGAGAGGCTTTGAGGTTGCTGAGGGAGCGTAAAGATGGATATGATATCGTGA 

TCAGCGATGTGAACATGCCTGACATGGATGGTTTCAAGCTTCTTGAGCATGTTGGTCTTG 

AATTAGACCTCCCTGTAATAATGATGTCGGTGGACGGCGAAACAAGCCGAGTGATGAAGG 

GAGTGCACACGGGAGCTTGTGATTACCTCTTGAAGCCGATAAGAATGAAGGAGTTAAAGA 

TTATATGGGAACATGTTCTGAGAAAGAAGCTTCAAGAAGTGAGAGATATCGAAGGCTGTG 

GATACGAAGGAGGAGCGGATTGGATGACTCGATACGATGAAGCACATTTTCTTGGAGGTG 

GTGAAGATGTTTCTTTTGGGAAAAAGAGAAAAGACTTTGACTTTGAGAAGAAGCTTCTTC

AAGATGAGAGTGATCCATCATCTTCTTCTTCCAAGAAAGCTAGAGTTGTTTGGTCTTTTG 

AGCTTCATCATAAGTTTGTCAACGCCGTTAACCAAATCGGATGCGATCACAAAGCTGGT^ 

CCAAGAAGATATTGGATCTCATGAATGTTCCATGGCTC^CTAGAGAAAATGTTGCAAGCC 

ACCTTCAGAAATATAGACTTTACCTGAGCAGATTAGAGAAAGGAAAGGAGCTCAAGTGTT

ATTCAGGTGGCGTGAAGAATGCGGATTCATCTCCAAAAGATGTCGAAGTGAATTCAGGCT

ACCAAAGCCCTGGGAGGAGCAGCTATGTATTCTCTGGAGGAAATTCTCTGATCCAAAAAG 

CAACAGAGATTGATCCAAAGCCACTTGCTTCAGCTTCTTTGTCTGACCCGAA 

TGATCATGCCTCCGAAAACAAAAAAGACGCGTATAGGATTTGATCCTCCCATTTCCTCCT 

CTGCGTTTGACTCTCTGCTTCCTTGGAATGATGTTCCAGAGGTCCTTGAATCGAAGCCGG 

TTCTGTATGAGAATAGCTTTCTCCAGCAACAACCATTGCCAAGTCAAAGTTCCTATGTTG 

CAATTTCTGCACC^VTCTCTCATGGAGGAGGAAATGAAGCCTCCTTATGAGACACCAGCAG 

GAGGCAGTAGTGTGAATGCAGATGAGTTTCTCATGCCACAAGAGAAGATCCCTACTGTAA 

CCCTTCAAGATTTGGATCCCTC^GCCATGAAGCTGCAGGAGTTCAACACAGAAGGCGAT^ 

CTGAAGAAGCTTGAACTGGGGAACTTCCAGAATCA(^TCATTCTGTTTCTTTAGACACTG 

ACTTAGACTTGACTTGGCTTCAAGGCGAGCGTTTCTTGCAAACACCGACTCCAGTTTCAA 

GATACAGTAGTAGCCCATCACTCCTATCTGAGCTCCCAGCCCACCTTAATTGGTATGGT^A 

ATGAGCGGCTGCCTGACCCTGACGAGTATTCCTTCATGGTAGACCAAGGTTTATTCATAT 

CTTAACCTTGTTCCAATAACTTOTTTTCGTATATTGGTTGGTGTAATGCAGAAAGATTTT 

GTGGGTATACCTGAAAATAATCTTGCTTTCCCAAGAACCTTCCATGATCGGATGCATTGT 

ACAATAATCCACGAGTGTCGTAGGCTAATTACACC^VAACAGGTTGATGACAGTGATAAGG 

CCACATGTTTCACACCGTCGCTTAAGATCTTTACTGTCACCTGGAAGGAAA 

>G1038 Amino Acid Sequence (domain in AA coordinates: 198-247) 

MEKSGFSPVGLRVLVVDDDPTWLKILEKMLKKCSYEVTTCGLAREALRLLRERKDGYDIV

ISDVIMPD^GFKLLEHVGLELDLPVII^S 

IIWQHVLRKKLQEVRDIEGCGYEGGADWITRYDEAHFLGGGEDVSFGKKRK^ 

QDESDPSSSSSKKARVVWSFELHHKFVNAVNQIGCDHKAGPKKILDLMNVPWLTRENVAS

HLQKYRLYLSRLEKGKELKCYSGGVKNADSSPKDVEVNSGYQSPGRSSYVFSGGNSLIQK

ATEIDPKPLASASLSDPNTDVTMPPKTKKTRIGFDPPISSSAFDSLLPWNDVPEVLESKP

VLYENSFLQQQPLPSQSSYVAISAPSLMEEEMKPPYETPAGGSSVNADEFLMPQDKIPTV

TLQDLDPSAMKLQEFNTEGDSEEA*
>G1073 (62..874)
CCCCCCGACCTGCCTCTACAGAGACCTGAAGATTCCAGAACCCCACCTGATCAAAAATAA
CATGGAACTTAACAGATCTGAAGCAGACGAAGCAAAGGCCGAGACCACTCCCACCGGTGG 
AGCCACCAGCTCAGCCACAGCCTCTGGCTCTTCCTCCGGACGTCGTCCACGTGGTCGTCC 
TGCAGGTTCC^AAAACAAACCCAAACCTCCGACGATTATAACTAGAGATAGTCCTAACGT 
CCTTAGATCACACGTTCTTGAAGTCACCTCCGGTTCGGACATATCCGAGGCAGTCTCCAC 
CTACGCCACTCGTCGCGGCTGCGGCGTTTGCATTATAAGCGGCACGGGTGCGGTCACTAA 
CGTCACGATACGGCAACCTGCGGCTCCGGCTGGTGGAGGTGTGATTACCCTGCATGGTCG 
GTTTGACATTTTGTCTTTGACCGGTACTGCGCT^ 

AGGTTTGACGGTGTATCTAGCCGGAGGTCAAGGACAAGTTGTAGGAGGGAATGTGGCTGG 
TTCGTTAATTGCTTCGGGACCGGTAGTGTTGATGGCTGCTTCTTTTGCAAACGGAGTTTA 
TGATAGGTTACCGATTGAAGAGGAAGAAACCCCACCGCCGAGAACCACCGGGGTGCAGCA 
GCAGCAGCCGGAGGCGTCTCAGTCGTCGGAGGTTACGGGGAGTGGGGCCCAGGCGTGTGA 
GTCAAACCTCGAAGGTGGAAATGGTGGAGGAGGTGTTGCTTTCT^ 

TATGAACAATTTTCAATTCTCCGGGGGAGATATTTACGGTATGAGCGGCGGTAGCGGAGG 
AGGTGGTGGCGGTGCGACTAGACCCGCGTTTTAGAGTTTTAGCGTTTTGGTGACACCTTT 
TGTTGCGTITGCGTGTTTGACCTCAAACTACTAGGCTACTAGCTATAGCGGTTGCGAAAT 
GCGAATATTAGGTT 

>G1073 Amino Acid Sequence (domain in AA coordinates: 33-42, 78-175) 
MELNRSEADEAKAETTPTGGATSSATASGSSSGRRPRGRPAGSKNKPKPPTIITRDSPNV 
LRSHVLEVTSGSDISEAVSTYATRRGCGVCIISGTGAVTNVTIRQPAAPAGGGVITLHGR
FDILSLTGTALPPPAPPGAGGLTVYLAGGQGQVVGGNVAGSLIASGPVVLMAASFANGVY

DRLPIEEEETPPPRTTGVQQQQPEASQSSEVTGSGAQACESNLQGGNGGGGVAFYNLGMN

MNNFQFSGGDIYGMSGGSGGGGGGATRPAF* 

>G1146 (129..3095)

cttctctagcgtcactcttcttcttcattggtcggtagaataaggccaaggaagggatca 
gttttaagttttgtttcattctttttgtagtggagaaaaagagtttttgaaaatcaaaac 
aacaaaaaatgccgattaggcaaatgaaagatagctctgagactcacttagttatcaaaa 
cccaacctttaaagcaccacaatccaaaaaccgttcaaaacggtaaaatccctcctcctt 
ctccttctccggtgacggtgactactccggcgacggttactcagagtcaagcttcttcac 
cttcaccaccgtcaaagaatcgtagccggaggagaaaccgtggtggaagaaaatctgatc 
aaggagatgtttgtatgagacctagctctcgtcctcgtaaaccgccaccgccaagtcaaa 
ccacttcctccgccgtctccgtcgccaccgccggtgagattgtcgctgtgaatcatcaga 
tgcagatgggtgttcgtaaaaactcaaactttgctccaagacctggatttggaacacttg 
gaactaaatgcattgttaaagctaaccactttctcgctgatttgcctaccaaggatttga 
atcagtatgatgttacaattactcctgaagtgtcatcaaagagtgttaacagagctataa 
ttgctgagttagttagactttacaaagagtctgatctcgggaggagacttccggcttacg 
atggccggaaaagtctttacactgctggagaacttccttttacttggaaggagttcagtg 
ttaagattgttgatgaagatgacggtatcatcaatggccctaaaagggagagatcatata 
aggtggcaatcaagtttgttgcacgggcaaatatgcatcacttaggcgagtttctagctg 
gtaaacgggcagattgtccgcaagaggcggtgcagattcttgatattgtactcagggagt 
tgtcggttaagaggttttgtcccgttggaagatctttcttttcgcctgatattaaaacac 
cgcagcgactcggtgaagggttagagtcatggtgtgggttttaccagagtattagaccaa 
ctcaaatgggtttatcactaaatatcgatatggcttcagctgcattcatcgagcctcttc 
cagtgatagagtttgtagcacagcttcttggaaaggatgtcttgtcgaagccattgtcgg 
attctgatcgcgtcaagattaagaagggtcttagaggagtgaaagtagaggttactcaca 
gagcgaatgtaagaaggaaataccgtgttgcgggtttaacaactcaaccaacaagagagc 
taatgtttccagtagatgagaactgtactatgaagtcagttattgagtatttccaagaga 
tgtatggattcacgatccagcacacgcatttgccatgtctccaagttggaaaccaaaaga 
aggcaagctatttgccgatggaggcatgcaaaattgtcgagggacaacggtacacgaaaa 
ggttgaatgagaagcagattactgctctcttgaaagttacatgccaaagggccgagggac 
agagaaacgatattttgcggactgtccaacacaacgcatatgatcaagatccatatgcaa 
aggagtttggcatgaacataagcgaaaagttagcttctgttgaagctcgtattcttccag 
ctccatggcttaagtatcacgagaacgggaaagaaaaagattgtctcccgcaagttggtc 
agtggaatatgatgaacaagaaaatgatcaacgggatgactgtgagcagatgggcctgtg 
ttaacttctcacgcagcgttcaagaaaacgttgctcgtggattttgtaatgaacttggtc 
agatgtgtgaagtctcaggcatggagtttaatccagaacccgtgataccaatatatagtg 
cgaggcccgatcaagtcgagaaagctctaaagcatgtttatcacacttcaatgaacaaaa
ccaaaggcaaagagttagagcttctgctggcaatattacctgataacaacggttcacttt

atggtgatcttaagagaatctgtgaaaccgagcttggtttgatatctcaatgttgtctca 

caaaacatgtgttcaagattagcaaacagtatctggcagatgtatcccttaaaatcaacg 

taaagatgggaggaaggaacacagttctagtagacgccataagctgtagaattccactgg 

ttagcgatataccgacaatcatttttggcgcagacgtgactcacccagagaacggggaag 

agtcaagcccttcaatcgctgctgttgttgcttctcaagactggcctgaagtgacaaaat 

atgcgggtttagtttgtgctcaagctcacaggcaagaacttatacaagatttgtataaaa 

catggcaagatcctgttcgcggtactgttagtggcggtatgatcagggaccttcttatct 

catttagaaaagcaacagggcaaaaaccgcttcgaattatcttttatcgtgatggagtaa 

gcgaagggcaattctatcaagttttactctatgagttggatgcaattcgaaaggcttgtg 

catcgcttgaaccgaattatcagccaccggtgacattcatagttgtacagaagcgtcacc 

acactcgtttgtttgctaataatcaccgagacaaaaacagtactgaccgaagcggaaata 

tcttaccaggtactgtagttgacactaaaatatgtcatccaactgaattcgacttctacc 

tttgtagccatgcgggtattcagggaacaagcaggcctgcacattaccatgttctttggg 

acgagaacaatttcacagcagatggtattcaatctctgactaacaatctctgttatacct 

atgcgcggtgcactcggtcggtctctatagttcctccagcgtattatgctcatcttgcag 

catttcgagcacgtttctacctggaacctgagataatgcaagacaacggatcaccgggta 

aaaagaacacgaaaacaacaactgtcggagacgtaggtgtgaagcctttaccagccttga 

aggagaatgtgaagagagtaatgttctactgctaaaaatccaaacattccttaatcagtt 

ttaataagtagtttggttgtttgcttgtagttcggctttagatttaccaatgtttttctt 

atgtaaattttgtcggtttggtttaagcctttaggaattagtgtattagggtttttctaa 

agttgtactttagctgatgataacgttgatgcagtgactttgttaaaacctcctcttcta 

cagtagtgtttacgtcgttcctc 

>G1146 Amino Acid Sequence (domain in AA coordinates: 886-896) 
MPIRQMKDSSETHLVIKTQPIjKHHNPKTVQ^ 

PSKNRSRRRNRGGRKSDQGDVCMRPSSRPRKPPPPSQTTSSAVSVATAGEIVAVNHQMQM
GVRKNSWFAPRPGFGTLGTKCIVKANHFLADLPTKDLNQYDVTITPEVSSKSVNRAIIAE
LVRLYKESDIiGRIOiPAYDGRKSIjYTAGELPFTWKEFSVKITO 

IKFVARANMHHLGEFLAGKRADCPQEAVQILDIVLRELSVKRFCPVGRSFFSPDIKTPQR
LGEGLESWCGFYQSIRPTQMGLSLNIDMASAAFIEPLPVIEFVAQLLGKDVLSKPLSDSD
RVKIKKGLRGVKVEVTHRANVRRKYRVAGLTTQPTRELMFPVDENCTMKSVIEYFQEMYG

FTIQHTHLPCLQVGNQKKASYLPMEACKIVEGQRYTKRIjNEKQITAIjLKTOCQRAEGQRIT 
DILRTVQHNAYDQDPYAKEFGMNISEKLASVEARILPAPWLKYHENGKEKDCLPQVGQWN
MMNKKMINGMTVSRWACVNFSRSVQENVARGFCNEL^ P I YSARP 

DQVEKALKHVYHTSMNKTKGKEIiELLIjAILPDl^GSLYGDLKRICETELGLISQCCIiTK^ 
VFKISKQYLADVSLKINVKMGGRNTVLVDAISCRIPLVSDIPTIIFGADVTHPENGEESS
PSIAAVVASQDWPEVTKYAGLVCAQAHRQELIQDLYKTWQDPVRGTVSGGMIRDLLISFR
KATGQKPLRIIFYRDGVSEGQFYQVLLYELDAIRKACASLEPNYQPPVTFIVVQKRHHTR
LFANtraRDKNSTDRSGNILPGTVVDTKI 

NFTADGIQSLTNNLCYTYARCTRSVSIVPPAYYAHLAAFRARFYLEPEIMQDNGSPGKKN

TKTTTVGDVGVKPLPALKENVKRVMFYC* 

>G1267 (152..967)

AAGTAGAGAATAATAATCACATCAAGATTGTTTATAACCCTCCCCl^AATCACCTTCTTA 
NTNACCACCCTCTCCGGCTCTCAACAGAACAACAACAAAAAAACAGCTTCCGTTGTCCTG 
TTCCGGCGAAATCGGACGGTCGAGATCAATCATGCATCGTAGAGCAGCAATTCAAGAATC 
GGATGACGAAGAAGATGAGACTTACAACGACGTCGTTCCTGAATCTCCTTCTTCTTGTGA 
AGACTCAAAGATCTCAAAACCAACTCCAAAGAAAAGGAGGAACGTAGAGAAGAGAGTTGT 
CTCAGTTCCGATAGGTGACGTGGAAGGATCTAAGAGCAGAGGCGAAGTATATCCACCGTC 
CGATTCATGGGCCTGGAGAAAGTACGGACAAAAACCGATCAAAGGCTCGCCTTATCCCAG 
GGGATATTACAGATGTAGTAGCTCAAAAGGATGTCCGGCGAGGAAGCAGGTGGAGAGAAG 
CCGTGTGGACCCTTCTAAGCTTATGATTACTTACGCCTGCGACCACAATCACCCTTTCCC 
TTCCTCCTCCGCTAACACCAAATCCCACCACCGCTCCTCCGTCGTCCTCAAAACCGCAAA 
GAAAGAGGAAGAATACGAAGAGGAGGAAGAAGAACTAACCGTCACCGCCGCAGAGGAACC 
ACCGGCGGGACTTGATCTAAGCCACGTAGACTCACCGTTGCTATTAGGCGGCTGCTACAG 
CGAAATCGGAGAGTTCGGGTGGTTCTACGACGCGTCGATOTCATCATCATCTGGTTCTTC 
GAATTTCCTCGACGTAACTCTAGAGAGAGGTTTTTCAGTAGGCCAAGAGGAAGATGAGTC 
TTTGTTCGGTGATCTCGGTGATTTACCTGATTGCGCCTCCGTGTTCCGCCGTGGGACTGT
TGCGACGGAGGAGCAACATCGAAGATGTGATTTTGGCGCCATTCCTTTCTGTGATAGTTC

TAGATGAGTTTGTGTGTGTAGCCAAAACCAAAAGAAAAAAAGAC 

ACTGTAAAGGTGTATCAATGGTGGATTCATTTTTTTAAAAAAAAAAAAAAAA 

>G1267 Amino Acid Sequence (domain in AA coordinates: 70-127) 

MHRRAAIQESDDEEDETYNDVVPESPSSCEDSKISKPTPKKRRNVEKRVVSVPIADVEGS

KSRGEVYPPSDSWAWRKYGQKPIKGSPYPRGYYRCSSSKGCPARKQVERSRVDPSKLMIT

YACDHNHPFPSSSANTKSHHRSSVVLKTAKKEEEYEEEEEELTVTAAEEPPAGLDLSHVD

SPLLLGGCYSEIGEFGWFYDASISSSSGSSNFLDVTLERGFSVGQEEDESLFGDLGDLPD

CASVFRRGTVATEEQHRRCDFGAIPFCDSSR* 

>G1269 (88..951)

AACAATTCTCTCTCTCTTTATTCTTCTTCTTCAGCTTCAGATTTCAGATCTTAAATCTTC 

AAGTCTTCTTCTTCTTCTTCTGCAACCATGGCTATGCAGGAACGTTGTGAGAGTT^ 

TCTGATGAACTTATATCTTCCTCAGATGCCTTTTACCTCAAGACAAGAAAGCCTTATACC 

ATCACTAAACAAAGAGAGAAATGGACAGAAGCAGAGCATGAGAAGTTTGTAGAAGCATTG 

AAACTCTATGGCAGAGCTTGGAGACGAATCGAAGAACATC 

CAGATTCGAAGCCATGCGCAGAAGTTCTTTACTAAGGTTGCTCGCGATTTTGGTGTTAGC 
TCTGAGTCCATTGAGATCCCGCCTCCAAGGCCAAAGAGAAAGCCGATGCATCCTTACCCT 
AGAAAGCTTGTGATTCCTGATGCAAAAGAGATGGTATACGCTGAACTAACCGGATCCAAG 
CTGATTCAGGATGAAGATAACCGATCTCCAAGATCG 

GGATTAGGTTCCATTGGTTCAAATTCACCTAACTCTTCTTCAGCTGAGTTATC^TCTCAC 
ACAGAGGAATCATTGTCTCTAGAAGCAGAGACCAAACAGAGCCTTAAGCTCTTTGGAAAA 
ACTTTTGTAGTTGGTGATTACAACTCTTCAATGAGTTGTGATGATTCTGAAGATGGCAAG 
AAGAAGCTATACTCAGAAACACAGTCTCTTCAATGTTCTTCTTCTACTTCAGAAAACGCT 
GAAACAGAAGTGGTAGTGTCGGAGTTCAAAAGAAGTGAGAGATCAGCTTTCTCTCAGTTA 
AAATCGTCGGTGACTGAGATGAACAAC^TGAGAGGGTTCATGCCTTACAAAAAGAGAGTA 
AAGGTGGAAGAAAACATTGACAATGTAAAATTATCATATCCTTTGTGGTGAAGTGTTCGT 
TTGTGTC^GTC^GTTGTGTAAACTCTTTTGATCTC^CATCAGATTATGTGTATAATG^ 
CAGAGTATTAGGGAAAGTTTTTTTGGATTAGATTCGTAAGATCACTCCAAAGTTTCGTGT 
CTTTC(^TATAACCAGTTAGAAATTGAGATCCTTGTACTTAAACATTTTTATTTGAT^ 
TCAAATCTTCTTGATGAAAAAAAAAA 

>G1269 Amino Acid Sequence (domain in AA coordinates: 27-83) 
MAMQERCESLCSDELISSSDAFYLKTRKPYTITKQREKWTEAEHEKFVEALKLYGRAWRR

IEEHVGTKTAVQIRSHAQKPFTKVARDFGVSSESIEIPPPRPKRKPMHPYPRKLVIPDAK
EMVYAELTGSKLIQDEDNRSPTSVLSAHGSDGLGSIGSNSPNSSSAELSSHTEESLSLEA
ETKQSLKLFGKTFVVGDYNSSMSCDDSEDGKKKLYSETQSLQCSSSTSENAETEVVVSEF
KRSERSAFSQLKSSVTEMNNMRGFMPYKKRVKVEENIDNVKLSYPLW*
>G1452 (175..1296)

ATTTATTAAGCATCAATGAGAGAACTTCAGAGCTGGGTTTGAGTTCTGTCCAATAATACA 

TAACCACGTTATCATTTTTGTCCTTTACTATCTCATTACACTCTTCTGTTATTCGCCCAA 

TTCTTACAGTCATTACTCTCTATAGGGCTCGAGCGGCCGCCCGGGCAGGTTTCTATGCAG 

ATGGTTCACACTTCCCGCTCCATTGCCCAGATTGGGTTCGGTGTTAAGTCGCAATTAGTA 

CTC^CTATAGGGCTCGAGCGGCCGCCCGGGCAGGTAATVAGATCAAACAATGTCTAAAGAA 

GCTGAGATGTCGATCGCGGTGTCGGCTTTGTTCCCTGGTTTTAGATTCTCTCCTACTGAT 

GTTGAACTTATCTCGTACTATCTTCGTCGTAAAATCGATGGTGATGAGAACTCTGTTGCT 

GTGATTGCTGAGGTCGAGATTTACAAGTTCGAGCCGTGGGACTTGCCAGAGGAATCGAAA 

CTGAAATCGGAGAACGAGTGGTTTTACTTCTGCGCGAGGGGGAGGAAGTACCCGCACGGG 

TC^CAAAGCCGGCGAGCCACACAGCTAGGATATTGGAAAGCGACCGGTAAAGAGCGGAGT 

GTTAAATCCGGGAACCAAGTTGTTGGAACCAAGAGAACGCTTGTATTTCATATCGGTCGG 

GCTCCTCGTGGCGAGAGAACGGAGTGGATTATGCATGAATACTGCATCCATGGAGCCCCA 

CAGGATGCATTAGTGGTGTGCCGGTTAAGAAAAAATGCTGATTTTCGGGCTAGTTCGACC 

CAAAAAATTGAGGATGGTGTTGTGCAAGACGATGGCTACGTTGGCCAAAGAGGTGGTTTG 

GACAAGGAGGACAAATCCTACT^ATGAATCTGAGCATCAGATACCAAATGGl^ 

GAATCATCAAATGTTGTTGAGGATCAGGCCGATACCGATGATGATTGTTACX3CCGAGATT 

CTGAACGATGATATAATAAAGCTCGACGAAGAAGCGTTGAAAGCTAGCCAAGCGTTTCGA 

CCAACTAATCCAACTCATCAAGAAACAATATCAAGCGAGTCATCGAGTAAGAGGTCAAAA 

TGTGGTATAAAAAAAGAATCAACGGAAACAATGAAT^^ 

AACGTTGCCGGAACCGACTCCAGCTGGAGATTCCCGAACCCGTTCAAAATCAAGAAAGAT
GATAGCCAGAGATTGATGAAGAATGTTCTGGCCACTACTGTTTTCTTGGCTATCTTATTT

TCTTTCTTTTGGACTGTATTAATAGCTAGGAACTAAAGCTAGTTACGACATACATATTAT 

TTATAC^TAAATAAATATAGTATTTTGTCTATGGCAAAT^AAAAAAAAAAAA 

>G1452 Amino Acid Sequence (domain in AA coordinates: 30-177) 

MQMVHTSRSIAQIGFGVKSQLVLTIGLERPPGQVKDQTMSKEAEMSIAVSALFPGFRFSP

TDVELISYYLRRKIDGDENSVAVIAEVEIYKFEPWDLPEESKLKSENEWFYFCARGRKYP

HGSQSRRATQLGYWKATGKERSVKSGNQVVGTKRTLVFHIGRAPRGERTEWIMHEYCIHG

APQDALVVCRLRKNADFRASSTQKIEDGVVQDDGYVGQRGGLDKEDKSYYESEHQIPNGD

IAESSNVVEDQADTDDDCYAEILNDDIIKLDEEALKASQAFRPTNPTHQETISSESSSKR

SKCGIKKSSTETMNCYALFRIKNVAGTDSSWRFPNPFKIKKDDSQRLMKNVLATTVFLAI

LFSFFWTVLIARN* 

>G1494 (114..1406)

TCGACAGAGTTGTGTTGGGCGTGGAACTTGGACTAGTTCCACATATCAGGTTATATAGAT 
CTTCTCTTTCAACTTCTGATTCGTCCAGAAGCTTTCCTAATCTGAGATCTGACATGGAAC 
ACCAAGGTTGGAGTTTTGAGGAGAATTATAGTTTGTCCACTAATAGAAGATCTATCAGGC 
CACAAGATGAACTTAGTGGAGTTATTATGGCGAGATGGACAAGTGGTTCTGCAG^
CTCATAGAGAACAAACCCAAACCCAGAAACAAGATCATCAT^ 

GGACCTTTCTTGAAGATCAAGAAACTGTCTCTTGGATCCAATACCCTCCAGATGAAGACC
CATTCGAACCCGACGACTTCTCCTCCra^^ 

CAACCT(^GAGACGGTTAAGCCTAAGTCCAGTCCTGAACCTCCTCAAGTCATGGTTAAGC 
CTAAGGCCTGTCCTGACCCTCCTCCTCAAGTCATGCCTCCTCCAAAATTTAGGTTAACAA 
ATTCATCATCGGGGATTAGGGAAACAGAAATGGAACAGTACTCGGTAACGACCGTTGGAC 
CTAGCCATTGCGGAAGCAACCCATCACAGAACGATCTCGATGTCTCAATGAGTCATGATC 
GAAGCAAAAACATAGAAGAAAAGCTTAATCCGAACGCAAGTTCCTCATCAGGTGGCTCCT 
CTGGTTGGAGCTTTGGCAAAGATATCAAAGAAATGGCTAGTGGAAGATGCATCACAACCG 
ACCGTAAGAGAAAAGGTATAAATCACACTGACGAATCTGTATCTCTATCAGATGCAATCG 
GTAACAAGTCGAACC^CGATCAGGATCAAACCGAAGGAGTCGAGCAGCTGAAGTTCATA 
ATCTCTCCGAAAGGAGGAGGAGAGATAGGATCAATGAGAGAATGAAGGCTTTGCAAGAAC 
TAATACCTCACTGCAGTAAAACTGATAAAGCTTCGATTTTAGACGAAGCCATAGATTATT 
TGAAATCACTTCAGTTACAGCTTCAAGTGATGTGGATGGGGAGTGGAATGGCGGCGGCGG 
CGGCTTCGGCTCCGATGATGTTCCCCGGAGTTCAACCTCAGCAGTTCATACGTCAGATAC 
AGAGCCCGGTACAGTTACCTCGATTTCCGGTTATGGATCAGTCTGCAATTCAGAACAATC 
CCGGTTTAGTTTGCCAAAACCCGGTACAAAACCAGATCATCTCCGACCGGTTTGCTAGAT 
ACATCGGTGGGTTCCCACACATGCAGGCCGCGACTCAGATGCAGCCGATGGAGATGTTGA 
GATTTAGTTCACCGGCGGGACAGCAAAGTCAACAACCGTCGTCTGTGCCGACGAAGACCA 
CCGACGGTTCTCGTTTGGACCACTAGGTTGGTGAGCCACTTTGC 

>G1494 Amino Acid Sequence (domain in aa coordinates: 261-311) 
MEHQGWSFEENYSLSTNRRS IRPQDELVELLWR^ 

RSSTFLEDQETVSWIQYPPDEDPFEPDDFSSHFFSTMDPLQRPTSETVKPKSSPEPPQVM 
VKPKACPDPPPQVMPPPKFRLTNSSSGIRETEMEQYSVTTVGPSHCGSNPSQNDLDVSMS 
HDRSKNTEEKLNPNASSSSGGSSGCSFGKDIKEMASGRCITTDRKRKRINHTDESVSLSD
AIGNKSNQRSGSNRRSRAAEVHNLSERRRRDRINERMKALQELIPHCSKTDKASILDEAI
DYLKSLQLQLQVMWMGSGMAAAAASAPMMFPGVQPQQFIRQIQSPVQLPRFPVMDQSAIQ 
NNPGLVCQNPVQNQIISDRFARYIGGFPHMQAATQMQPMEMLRFSSPAGQQSQQPSSVPT 

KTTDGSRLDH* 
>G1548 (1..2511) 

ATGGCAATGTCTTGCAAGGATGGTAAGTTGGGATGTTTGGATAATGGGAAGTATGTGAGG 
TATAC^CCT^AAC^GTTGAAGC^CTTGAGAGGCTTTATCATGACTGTCCTAAACCGAGT 
TCTATTCGCCGTCAGCAGTTGATCAGAGAGTGTCCTATTCTCTCTAACATTGAGCCTAAA 
GAGATGAAAGTGTGGTTTCAGAACCGAAGATGTAGAGAGAAACAAAGGAAAGAGGCTTC^ 
CGGCTTCAAGCTGTGAATCGGAAGTTGACGGCAATGAACAAGCTCTTGATGGAGGAGAAT 
GAGAGGTTGCAGAAGCAAGTGTCACAGCTGGTCGATGAAAACAGCTACTTCCGTCAACAT
ACTCCAAATCCTTCACTCCCAGCTAAAGACACAAGCTGTGAATCGGTGGTGACGAGTGGT
CAGC^CCAATTGGCATCTCAAAATCCTCAGAGAGA 

ATTGCAGAAGAAACTTTAGCAGAGTTTCTTTCAAAGGC^CTGGAACCGCTGTTGAGTGG 
GTTCAGATGCCTGGAATGAAGCCTGGTCCGGATTCCATTGGAATCATCGCTATTTCTCAT 
GGTTGCACTGGTGTGGCAGCACGCGCCTGTGGCCTAGTGGGTCTTGAGCCTACAAGGGTT
GCAGAGATTGTCAAGGATCGTCCTTCGTGGTTCCGCGAATGTCGAGCTGTTGAAGTTATG
AACGTGTTGCCAACTGCCAATGGTGGAACCGTTGAGCTGCTTTATATGCAGCTCTATGCA 
CCAACTACATTGGCCCCACC^CGCGATTTCTGGCTGTTACGTTAC^CCTCTGTTTTAG^ 
GATGGCAGCCTTGTGGTGTGCGAGAGATCTCTTAAGAGCACTCAAAATGGTCCTAGTATG 
CCACTGGTTCAGAATTTTGTGAGAGCAGAGATGCTTTCCAGTGGGTACTTGATACGGCCT 
TGTGATGGTGGTGGCTCAATCATACACATAGTGGATCATATGGATTTGGAGGCTTGTAGC 
GTGCCTGAGGTCTTGCGCCCGCTCTATGAGTCACCCAAAGTACTTGCACAGAAGACAACA 
ATGGCGGCACTGCGTCAGCTCAAGCAAATAGCTCAGGAGGTTACTCAGACTAATAGTAGT 
GTTAATGGGTGGGGACGGCGTCCTGCTGCCTTAAGAGCTCTCAGCCAGAGGCTAAGCAGA 
GGCTTCAATGAAGCTGTAAATGGTTTCACTGATGAAGGATGGTCAGTGATAGGAGATAGC 
ATGGATGATGTC^CAATCZACTGTAAACTCITCTCCAGACAAGCTAATGGGTCTAAATCTT 
ACATTTGCCAATGGCTTTGCTCCTGTAAGCAATGTTGT^ 

CTTTTACAGAATGTTCCTCCGGCGATCCTGCTTCGGTTTCTGAGGGAGCATAGGTCAGAA 

TGGGCTGACAACAAC^TTGATGCGTATCTAGCAGCAGCAGTTAAAGTAGGGCCTTGTAGT 

GCCCGAGTTGGAGGATTTGGAGGGCAGGTTATACTTCCACTTGCTCATACTATTGAGCAT 

GAAGAGTTTATGGAAGTCATGAAATTGGAAGGTCTTGGTCATTCCCCTGAAGATGCAATC 

GTTCCAAGAGATATCTTCCTTCTTCAACTTTGTAGCGGAATGGATGAAAATGCTGTAGGA 

ACCTGTGCGGAACTTATATTTGCTCCAATCGATGCTTCGTTTGCGGATGATGCACCTCTG 

CTTCCTTCTGGTTTTCGTATTATCCCTCTTGATTCCGCAAAGGAA 

CGAACCTTGGATCTTGCTTCGGCACTGGAAATTGGTTC^ 

GATCAATCAGGAAACTCCACATGTGCAAGATCTGT 

ATCGAGAGCCATATGCAAGAACATGTAGCATCCATGGCTAGGCAGTATGTTCGAGGTATC 

atatcatcggtgcagagagtagcattggctctttctccttctcatatcagctcacaagtt
ggtctacgcactcctttgggtactcctgaagcccaaacacttgctcgttggatttgccag
agttacaggggctacatgggtgttgagctacttaaatcaaacagtgacggcaatgaatct
attcttaagaatctttgggatcacactgatgctataatctgctgctcaatgaaggccttg
cccgtcttcacatttgcaaaccaggcg 

cttcaagacatctctttagagaagatatttgatgacaatggaagaaagactctttgctct 
gagttcccacagatcatgcaacagggcttcgcgtgccttcaaggcgggatatgtctctca 
agcatggggagaccagtttcgtatgagagagcagttgcttggaaagtactcaatgaagaa 
gaaaatgctcattgcatctgctttgtgttcatcaattggtcctttgtgtga 

>G1548 Amino Acid Sequence (domain in AA coordinates: 17-77) 
MAMSCKDGKLGCLDNGKYVRYTPEQVKALERLYHDCPKPSSIRRQQLIRECPILSNIEPK
QIKVWFQNRRCREKQRKEASRLQAVmKLTAMNKLLMEE^ 

TPNPSLPAKDTSCESVVTSGQHQLASQNPQRDASPAGLLSIAEETLAEFLSKATGTAVEW
VQMPGMKPGPDSIGIIAISHGCTGVAARACGLVGLEPTRVAEIVKDRPSWFRECRAVEVM 

NVLPTANGGTVELLYMQLYAPTTJ^ 

PLVQNFVRAEMLSSGYLIRPCDGGGSIIHIVDHMDLEACSVPEVLRPLYES 

MAALRQLKQIAQEVTQTNSSVHGWGRRPAALRALSQRLSRGFNEAVNGFTDEGWSVIGDS

ITODVTITWSSPDKLMGIiNIiTFANGF 

WADNNIDAYLAAAVKVGPCSARVGGFGGQVILPLAHTIEHEEFMEVIKLEGLGHSPEDAI
VPRDIFLLQLCSGMDENAVGTCAELIFAPIDASFADDAPLLPSGFRIIPLDSAKEVSSPN
RTLDLASAIiEIGSAGTKASTDQSGNST(^tf*SV^^ 

ISSVQRVALALSPSHISSQVGLRTPLGTPEAQTLARWICQSYRGYMGVELLKSNSDGNES
ILKNLWHHTDAIICCSMKALPWTFANQAGL^ 

EFPQIMQQGFACLQGGICLSSMGRPVSYERAVAWKVLNEEENAHCICFVFINWSFV*
>G1574 (1..1962) 

ATGGATGATACAAT^GACATGAGTTCAGGTAGTGATGAAGAAGTACAAGAAGAGAAGACC 
ACTGTTAACGAGAGGGTCATCTATCAGGCTGC^TTACAAGATCTGAAGC^CCCAAGACC 
GAAAAGGATCTACCTCCTGGTGTTCTTACAGTTCCrCTTATGAGGCATC^GAAAATTGCA 
TTGAACTGGATGCGTAAGAAAGAAAAAAGAAGCAGGCACTGTTTGGGAGGGATATTAGCA 
GATGATt^GGGACTTGGTAAAACGATCT'CGACGATCTCTCTTATCCTGTTACAAAAGTTG 
AAGTCACAATCTU^AGCAGAGAAAGCGAAAAGGTCAAAACTCTGGTGGTACATTGATTGTT 
TGTCCAGCAAGTGTTGTAAAACAATGGGCAAGAGAAGTTAAAGAGAAGGTTTCTGATGAA 
CACAAACTCTCTGTTTTAGTCCACCATGGATCT 

GCAATATATGATGTGGTCATGACAACTTACGCCATTGTTACyVAATGAAGTTCCACAAAAC 
CCTATGCTGAATCGTTATGATAGTATGAGAGGCAGAGAAAGCCTTGACGGATCGAGTTTG 
ATTCAGCCTCACGTTGGTGCACTAGGAAGAGTTAGGTGGTTGAGAGTAGTATTAGATGAA

GCTCATACAATTAAAAACCATAGAACCCTAATTGCAAAAGCTTGTTTTAGCCTTAGAGCC 

AAAAGGAGATGGTGTTTGACTGGAACGCCGATAAAGAACAAAGTAGACGATCTTTATAGC 

TATTTCAGATTTCTTAGATATC^TCCATATGCCATGTGCAATTCATTTCACCAAAG 

AAAGCTCCAATTGATAAAAAGCCTCTTCATGGTTACAAGAAGCTTCAAGCTATTCTAAGG 

GGTATAATGTTGCGCCGCACCAAAGAATGGTCTTTCTACAGGAAGCTTGAATTGAATTCA 

CGTTGGAAGTTTGAGGAATATGCTGCTGATGGGACTTTGCATGAAC^ 

TTGGTGATGCTTTTGCGACTACGCCAAGCTTGTAAC(^TCCAC^^ 

AGTCACTCAGATACTACAAGAAAAATGTCAGATGGAGTTCGAGTAGCCCCTAGAGAGAAT 
CTAATCATGTTCCTCGATCTCTTGAAATTATCCTCAACCACCTGCTCTGTTTGTAGTGAT 
CC^CCAAAAGACCCTGTTGTTACTTTGTGTGGCCATGTGTTTTGTTATGAGTGTGTGTCT 
GTAAACATTAACGGGGATAACAATACGTGCCCTGCACTTAATTGCCACAGCCAGCTTAAA 
CATGATGTTGTTTTCACTGAATCTGCAGTTAGAAGTTGCATCAACGATTATGATGATCCT
GAAGATAAAAATGCTTTAGTTGCATCAAGGCGAGTTTATTTCATCGAAAATCCGAGCTGT 
GATAGAGATTCTT(^GTCGCTTGCAGAGCAAGGCAGTCCAGACACTCC^CCAATAAAGAC 
AATAGTATCAGTGGACTGAATCTCATTTTTACGTTTCTCJ^ 

GAAACAGGTGCGATGTTGATGTCTCTTAAAGCTGGAAACCTTGGATTGAATATGGTAGCT 

GCAAGTCATGTCATTCTACTGGACCTATGGTGGAATCCAACAAGAGAGGATCAAGC 

GATCGAGCTGATCGTATCGGACAAACTCGAGCTGTTACGG 

AATACCGTTGAGGAACGAATTTTGACTCTTCATGAACGTAAAAGGAACATTGTTGCATCT 
GGATTGGGTGAAAAAAACTGGCAAAAGTTCTGCGATTCAACTAACACTAGAAGATCTCGA
ATATCTGTTTTTTGGTGTGTAGAATATCCCAGAGTTTTTATTGATAAGAGGAATAAAACC 
TTTAGCTATTTAATAAGTCACAAGTGTGAATGTAATGAATAA 

>G1574 Amino Acid Sequence (domain in AA coordinates: 28-350) 

MDDTMDMSSGSDEEVQEEKTTVNERV^ 

LNWMRKKEKRSIUICLGGIIjADDQGLGKTISTI 

C P AS WKQ WAREVKE KVSDEHKLS VL VHHG S HRTKD PTE IAI YDWMTTYAI VTNE VP QN 
PMUSJRYD SMRGRE S LDGS SIjI QPHVG ALGRVRWLRVVIiDEAHT I KNHRTL I AKAC F S LRA 
KRRWCLTGTP IKNKVDDLYS YFRFLRYHPYAMCNS FHQRIKAP IDKKPLHGYKKLQAIIjR 
GIMLRRTKEWSFYRKLELiNSRWKPEEYAADGTIiHEHI^ 
SHSDTTRKMSDGWVAPRENIilMFLDLLKLSSTTC 

VNINGDNNTCPALNCHSQIjKHDVVFTE s avrs cihdyddpedknalvasrrvyfienp sc 
DRDSSVACRARQSRHSTJsTKDNSISGLNIjIFTFLKDKCNDYET 

ashvilldlwwpttedqaidrahrigqt^ 

algeknwqkfcdstntrrsr i s vfwcve yprvf idkrnktfs yl i shrce cne * 

>G1586 (1..807) 

ATGAATCAAGAAGGTGCTTCACATAGCCCATCCTCCACTTCCACCGAACCAGTCCGGGCA 
CGTTGGTCACCTAAACCGGAGCAAATCTTGATACTCGAATCCATCTTCAACAGTGGTACT 
GTTAACCCACCAAAAGATGAAACGGTGAGGATAAGAAAGATGCTTGAGAAATTCGGTGCT 
GTGGGAGACGCAAACGTCTTCTACTGGTTTCAAAACCGACGGTCAAGATCTCGCCGGAGA 
CACCGGCAGCTTTTAGCAGCCACCACCGCAGCCGCCACCTCCATAGGAGCTGAAGACCAC 
CAGCACATGACGGCCATGAGCATGCATCAAT^ 

GGGTTTGGAAGTTGTAGCAACTTATCAGCTAATTACTTCCTTAATGGATCGTCGTCATCT 
CAAATCCCTTCCTTTTTCCTCGGCCTCTCTTCTTCAAGTGGTGGGTGTGAGAAC?UICAAT 
GGTATGGAGAATCTCTTCAAAATGTATGGCCATGAATCTGATCATAATCATCAGCAGCAG 
CATCATAGCTC7VAATGCTGCATCAGTTTTAAACCCATCTGATCAAAACTCCAACTCCCAA 
TACGAACAAGAAGGGTTTATGACGGTGTTTATAAACGGAGTTCCTATGGAAGTAACAAAA 
GGAGCAATAGACATGAAAACAATGTTCGGTGATGATTCGGTGTTACTTC^TTCCTCTGGT 
CTTCCTCTTCCC^CTGATGAGTTTGGTTTCTTGATGCATTCTTTACAACATGGACAAACT 
TATTTCCTGGTACCGAGACAGACATGA 

>G1586 Amino Acid Sequence (domain in AA coordinates: 21-81) 

MNQEGASHSPSSTSTEPVRARWSPKPEQILILESIFNSGTVNPPKDETVRIRKMLEKFGA

VGDANVFYWFQNRRSRSRRRHRQLIiAATTAAATS I GAEDHQHMTAMSMHQYPCSNNE IDL» 

GFGSCSNLSANYFLNGSSSSQIPSFFLGLSSSSGGCENNNGMEl^FKMYGHESDHNHQQ 

HHSSNAASVLNPSDQNSNSQYEQEGFMTVFINGVPMEVTKGAIDMKTMFGDDSVL 

LPIjPTDBFGFLMHSIiQHGQTYFIiVPRQT* 

>G1786 (1..1170) 






ATGATCGTGTACGGTGGGGGAGCATCCGAGGACGGTGAAGGTGGAGGGGTGGTTCTGAAG 
AAAGGGCCATGGACGGTGGCCGAGGACGAGACACTGGCGGCTTACGTACGGGAATACGGT 
GAAGGGAACTGGAATTCTGTTCAGAAGAAGACATGGCTGGCTAGGTGTGGCAAGAGCTGC 
CGCCTCCGCTGGGCTAACCACTTACGACCTAATCTCAGGAAAGGCTCCTTCACCCCCGAG 
GAAGAACGTOTCATCATACAACTCCACTCTCAGCTAGGCAACAAATGGGCTCGCATGGCT 
GCTCAGTTACCAGGCAGAACAGATAACGAGATCAAGAACTACTGGAACACGAGGTTGAAA 
CGCTTCCAACGCCAAGGCCTCCCTCTCTACCCTCCAGAATATTCCCAAAACAATCATCAA 
CAACAAATGTATCCTCAACAGCCCTCCTCACCTCTCCCGTCCCAAACACCTGCTTCTTCC 
TTTACCTTTCCTCTCCTCCAACCGCCTTCTCTGTGTCCCAAACGTTGTTATAACACTGCC 
TTCTCTCCCAAGGCCTCATATATTTCTTCTCCAACCAATTTCCTTGTCTCGTCTCCGACC 
TTTCTTGACACCCATTCCTCTCTTTCCTCCTATC^GTCTACCAATCCGGTTTACTCCATG 
AAACATGAGCTCTCTTCAZVACCAAATTCCATACTCTGCCT 

AGCAAGTTCTCAGACAATGGGGATTGTAACCAAAACCTGAACACCGGTTTGCATACAAAT 

ACCTGTCAGCTGTTAGAGGATCTTATGGAGGAGGCCGAGGCTCTAGCTGATAGCTTTCGT 

GCTCCTAAGCGGAGACAAATCATGGCTGCGCTTGAGGACAACAACZUVCAAGAACAACTTT 

TTCTCGGGAGGTTTCGGACATCGTGTTTCTTCCAACAGTCTATGTTCCTTGCAAGGTTTA 

ACACCAAAGGAAGATGAGTCTCTCCAGATGAACACAATGCA^ 

CTTCTTGACTGGGGAAGTGAAAGTGAAGAAATCT 

ACAGAGAACAACCTTGTCCTTGACGATCACCAGTTCGCTTTTCTGTTTCCAGTTGATGAT 
GACACCAACAACTTGCCAGGGATCTGCTAG 

>G1786 Amino Acid Sequence (domain in AA coordinates: TBD) 
MIVYGGGASEDGEGGGVVI,KKGPVm7AEDETI*AAY^ 

RliRWANHLRPNLRKGS FTPEEERL 1 1 QLHS QLGNK^JARMAAQLPGRTDNEI K3SnTWNTRIjK 
RFQRQGLPLYPPEYSQl^QQQMYPQQPSSPLPSQTPASSFTFPLIiQPPSIiCPKRCrraTA 
FSPKASYISSPTNFLVSSPTFLHTHSSLSSYQSTNPVYSMKHELSSNQIPYSASIiGWQV 
SKFSDNGDCNQNl^NTGIiH^^ 

FSGGFGHRVSSWSLCSLQGLTPKEDESLQMOTMQDEDITKLIiDWGSESEEISNGQSSVIT 
TENNLVLDDHQFAFIjFPVDDDTNNIjPGI c * 
>G1792 (77..496)

AATCCATAGATCTCTTATTAAATAAC^GTGCT 

TAGAACACCA^GTTAATGGAGAGCTCATVACAGGAGCAGGAACAACCAAT 

CAAGCAAGCTCGTTTCCGGGGAGTTCGAAGAAGGCCTTGGGGAAAGTTTGCAGCAGAGAT 

TCGAGACCCGTCGAGAAACGGTGCCCGTCTTTGGCTCGGGACATTTGAGACCGCTGAGGA 

GGCAGCAAGGGCTTATGACCGAGCAGCCTTTAACCTTAGGGGTCATCTCGCTATACTCAA 

CTTCCCTAATGAGTATTATCCACGTATGGACGACTACTCGCTTCGCCCTCCTTATGCTTC 

TTCITCTTCGTCGTCGTCATCGGGTT 

AGAAGTTTTCGAGTTTGAGTATTTGGACGATAAGGTTCTTGAAGAACTTCTTGATTCAGA 
AGAAAGGAAGAGATAATCACGATTAGTTTTGTTTTGATATTTTATGTGGCACTGTTGTGG 
CTACCTACGTGCATTATGTGCATGTATAGGTCGCTTGATTAGTACTTTATAACATGCATG 
CGACGACCATAAATTGTAAGAGAAGACGTACTTTGCGTTTTCATGAAATATGAATGTTA^ 
ATGGTTTGAGTACAAAAAAAAAAAAAAAAAAAAAAA 

>G1792 Amino Acid Sequence (domain in AA coordinates: 17-85)
MESSNRSSNNQSQDDKQARFRGVRRRPWGKFAAEIRDPSRNGARLWLGTFETAEEAARAY
DRAAFNLRGHLAILNFPNEYYPRMDDYSLRPPYASSSSSSSSGSTSTNVSRQNQREVFEF
EYLDDKVLEELLDSEERKR*
>G1865 (48..899)

AAGAAGAGGACATGAAGCACAGAGATTCTGCAGACTGCAGGTGACCAATGGACACTTTAT 

CAATAAAAACATAC^TACTACTCTCTTACACTTTGAATTTTCCAATACAA 

TTAATCTCTCTTTCTTCTTCATCTCTCTTTCTCTTTCTCTCTTCATGGCTACAAGGATTC 

CATTCACAGAATCACAATGGGAAGAACTTGAAAACCAAGCTCTTGTGTTCAAGTACTTAG 

CTGCAAATATGCCTGTTCCACCTCATCTTCTCTTCCTCATCAAAAGACCCTTTCTCTTCT 

CTTCTTCTTCTTCTTC^TCTTCTTCTTC^^GCTTCTTCTCTCCCACTCTTTCTCCACACT 

TTGGGTGGAATGTGTATGAGATGGGAATGGGAAGAT^AGATAGATGCAGAGCCAGGAAGAT 

GTAGAAGAACTGATGGCAAGAAATGGAGATGCTCTAAAGAAGCTTACCCTGACTCTAAGT 

ACTGTGAGAGACATATGCATAGAGGCAAGAACCGTTCTTCCTCAAGAAAGCCTCCTCCTA 

CTCAATTCACTCCAAATCTCTTTCTCGACTCTTCTTCCAGAAGAAGAAGAAGTGGATACA 

TGGATGATTTCTTCTCCATAGAACCTTCCGGGTCAATCAAAAGCTGCTCTGGCTCAGCAA 
TGGAAGATAATGATGATGGCTCATGTAGAGGCATCAACAACGAGGAGAAGCAGCCGGATC
GAC^TTGCTTCATCCTTGGTACTGACTTGAGGACACGTGAGAGGCCATTGATGTTAGAGG 
AGAAGCTGAAACAAAGAGATCATGATAATGAAGAAGAGCAAGGAAGCAAGAGGTTTTATA 
GGTTTCTTGATGAATGGCCTTCTTCTAAATCTTCT 

CATCTTTTGTTCTTATAACCTTGTATTTCTTGTTAAGATGGTAATGCAAATT 

>G1865 Amino Acid Sequence (domain in AA coordinates: 124-149) 

MDTLSIKn^YLLLSYTFNFPIQIPIFNLSFFFISIjSIjSLFMATRIPFTESQWEEIjENQALV 

FKYIiAANMPVPPHLLFLIKRPFLFSSSSSSSSSSSFFSPTLSPHFGWNVYEMGMGRKIDA 

EPGRCRRTDGKKWRCSKEAYPDSKYCERHMHRGKSTC^ 

RSGYMDDFFSIEPSGSIKSCSGSAMEDNDDGSCRGINNEEKQPDRHCFILGTDLRTRERP 

LMLEEKLKQRDHDNEEEQGSKRFYRFLDEWPSSKSSVSTSLFI* 

>G1886 (43..909)

AGGAAACATAAGTAATCX3TTGCTTCGATCCTTTGTTACATGGATGGATCCTGAACAGGAA 

ATCTCAAACGAGACTTTGGAAACTATATTGGTAAGTTCAACAAAAGGAAGCAATAATAAC 

AATAAGAAAATGGAAGAAGAAATGAAGAAGAAAGTATCAAGAGGAGAATTAGGAGGTGAA 

GCTCAAAATTGTCCAAGATGTGAATCTCCAAACACAAAGTTTTGTTAOT 

AGTCTCTCACAACCTCGTTACTTCTGCAAATCTTGTCGGAGATATTGGACTAAAGGCGGT 

ACTCTTCGTAACGTTCCCGTCGGTGGTGGTTGCCGTCGAAAC^y^ACGATCCTCTTCCTCA 

GCTTTCTCCAAGAACAACAACAATAAGTCTAT^ 

CCTTTAATTACGGGAATGCC^CCATCATCTTTTGGTTATGATCACTCCATTO 

CTCGCTTTCGCTACTCTCCAAAAGCATCATTTATCCTCTCAAGCTACTACGCCTTCTTTT 

GGGTTTGGAGGTGATCTTTCTATTTATGGAAACTCAACGAATGATGTAGGGATCTTCGGA 

GGGCAAAACGGTACTTATAACAATAGTTTGTGTTATGGGTTTATGTCCGGAAATGGTAAT 

AATAATCAAAATGAAATCAAGATGGCTTCTACATTGGGGATGTCTTTGGAAGGAAACGAG 

AGAAAGCAAGAGAATGTGAACAATAACAATAATAACTGA.GAGAATCCTAGCAAGGTGTTC 

TGGGGGTTTCCATGGCAGATGACCGGAGATTCCGCCGGAGTTGTACCGGAGATTGATCCC 

GGAAGGGAAAGCTGGAATGGGATGGTTTCATCTTGGAATAATGGTTTACTCAACACTCCT 

TTGGTCTAGCAGATCATTAA 

>G1886 Amino Acid Sequence (domain in AA coordinates: 17-59)
MDPEQE ISNETLET ILVS STKGSNNNNK3CMEEEMKKKVSRGEIjGGEAQNCPRCES PNTKF 
CYYNNYSLS QPRYFCKSCRRYWTKGGTLRNVPVGGGCRRNKRS S S SAFSKNUNNKS INFH 
TDPLQNPL ITGMPPS S FGYDHS IDLNLAFATIiQKHHLS SQATTPS FGFGGDLS I YGNSTN 
DVGIFGGQNGTYNNSLC YGFMSGNGNNNQNE I KMASTLGMS LEGNERKQENYNNRMNNSE 
NPSKVFWGFPWQMTGDSAGVVPEIDPGRESVmGlWSSWiraGLLNTPLV* 
>G1933 (33..1418)

AATTGAGATTAAAGTAATTTATCTTTCAGAAAATGGCGGTTGAAGACGATGTATCTTTGA 
TAAGAACGACGACGTTAGTGGCACCAACAAGACCCACGATTACAGTTCCTCATAGACCTC 
CGGCGATCGA7UVCGGCGGCGTATTTCTTTGGCGGTGGAGATGGGCTTAGTCTAAGCCCAG 
GGCCACTTTClTTTGTCTCTTCTTTGTTTGTTGATAACTTCCCTGACGTCTTGACGCCGG 
ATAACCAACGGACGACGTCGTTTACTCAGCTTCTTAACGGAACTATGTCGGTGTCTCCTG 
GTGGCGGAGGACGTTCAACGGCGGGGATGTTCGCCGGAGGAGGTCCGATGTTTACAATCC 
CTTCTGGTTTCAGCCCTTCTAGTCTTCTCACCTCGCCCATGTTCTTTCCCCCGCAGTCGT 
CAGCTCATACCGGCTTTATTCAACCACGGC^ 

ACACGTTTCCTCACCATATGCCACCATCGACATCCGTCGCCGTCCATGGTCGTCAATCTT 
TAGACGTTTCACAAGTAGATCAAAGAGCTCGAAACCATTATAATAATCCGGGGAATAACA 
ATAATAACCGGTCGTATAACGTTGTGAACGTTGATAAACCGGCGGATGACGGTTATAACT 
GGAGGAAGTACGGACA7\AAGCCTATCAAAGGGTGTGAATATCCAAGGAGTTATTACAAAT 
GTACACATGTTAAC^GTCCGGTGAAGAAGAAAGTCGAACGGTCATCGGATGGACAGATCA 
CTCAGATCATTTACAAAGGTCAACATGATCACGAGAGGCCTCAGAATCGCCGTGGCGGTG 
GAGGCAGAGATTCCAGTGAGGTTGGTGGTGCAGGGCAT^TGATGGAATCTAGTGATGATA 
GTGGTTATCGTAAGGATCATGATGATGATGATGATGATGATGAAGATGATGAAGATCTTC 
CGGCTTCAAAGATAAGAAGAATAGACGGTGTGTCGACGACTCACCGGACGGTGACCGAGC 
CTAAGATTATCGTTCAGACAAAAAGTGAAGTCGATCTTCTCGACGATGGCTATAGGTGGC 
GTAAGTACGGACAAAAAGTTGTCAAAGGAAATCCCCATCCAAGGAGCTATTATAAATGTA 
CAACGCCAAATTGTACGGTCCGTAAACATGTAGAGAGAGCTTCCACGGATGCTAAGGCTG 
TGATTACAACTTACGAAGGTAAACACAATCACGATGTCCCTGCCGCTAGAAACGGTACCG 
CGGCAGCAACCGCAGCTGCGGTGGGGCCGTCTGACCACCATCGTATGAGATCAATGTCGG 
GGAACAATATGCAACAACATATGAGTTTCGGTAAC^^

TTCTTTTGAGGTTGAAAGAAGAGAAAATCACAATTTGACTTTTAAGAACCA 
AGATTGATATT 

>G1933 Amino Acid Sequence (conserved domain in AA coordinates: 205-263, 344-404)

MAVEDDVSLIRTTTLVAPTRPTITVPHRPPAIETAAYFFGGGDGLSLSPGPLSFVSSLFV 

DNFPDVXiTPDNQRTTSFTQIiLNGTMSVSPGGGGRSTAGMFAGGGPMFTIPSGFSPSSIiLT 

SPMFFPPQSSAHTGFIQPRQQSQPQPQRPDTFPHHMPPSTSVAVHGRQSIiDVSQVDQRAR 

NHYlsINPGNNNNlS^ 

VERS SDGQITQI I YKGQHDHERPQNRRGGGGRDSTEVGGAGQMMES SDDSGYRKDHDDDD 
DDDEDDEDLPASKIRRIDGVSTTHRTVTEPKIIVQTKSEVDLLDDGYRWRK^GQKVVKGN 
PHPRS YYKCTTPNCTVRKHVERAS TD AKAVI TTYEGK3TNHD VP AARNGTAAAT AAAVG P S 
DHHIiJ^SMSGNl^QQHMSFGNNNNTGQSPVLIjRLKEEKITI * 
>G2059 (58..1089)

TTAAGAACAGGCTTCATTCTCTGGACAAACACT 

GAAGATCAGTTTCCTAAAATAGAAACTAGCTTCATGCACGACAAGCTCTTGTCTTCTGGA 
ATCTACGGGTTCTTGAGTTCTTCGACGCCGCCACAACTTCTCGGTGTTCCAATATTTTTG 
GAAGGTATGAAATCTCCTCTTCTTCCTGCTTCTTCGACTCCGAGCTACTTTGTGTCGCCT 
CATGATCATGAGCTCACATCTTCTATTCATCCATCTCCGGTAGCTTCTGTTCCTTGGAA^ 
TTTCTAGAATCTTTTCCTCAGTCTCAACATCCTGATCATCATCCTTCTAAACCTCCAAAC 
CTTACTTTGTTCCTTAAAGAACCAAAGCTACTAGAACTTTCTCAATCCGAAAGCAACATG 
AGCCCTTACCATAAATACATCCCAAACTCCTTTTATCAATCAG 

TGGGTAGAGATCAATAAAACTCTAACCAACTATCCCTCGAAAGGTTTTGGAAACTATTGG 
CTAAGTACCACCAAGACTCAACCCATGAAGTCAAAAACAAGAAAGGTTGTTCAGACGACG 
ACCCCAACAAAACTGTATAGAGGAGTGAGACAAAGACACTGGGGCAAATGGGTCGCAGAG 
ATTAGGCTTCCAAGGAACAGAACCCGTGTTTGGCTCGGCACTTTTGAAACCGCTGAGCAA 
GCAGCAATGGCTTACGATACAGCAGCTTATATCCTTCGTGGCGAATTCGCACACCTCAAC 
TTTCCTGATCTTAAACACCAGCTCAAGTCCGGTTCTTTGCGATGCATGATCGCCTGACT^ 
TTGGAGTCCAAGATTCAACAGATCTCATCTTCCCAAGTAAGTAACTCTCCTTCTCCTCCT 
CCTCCAAAAGTGGGAACACCGGAGCAAAAGAATC^TCACATGAAGATGGAGTC^GGAGAA 
GACGTGATGATGAAGAAACAGAAAAGCCATAAGGAAGTGATGGAAGGAGATGGTGTACAA 
TTGAGTAGGATGCCTTCTTTGGATATGGATCTCATTTGGGATGCTCTCTCATTTCCTCAT 
TCTTCTTGACTTGAAATTAATATTTGTCAAACTTAT^ 

TATCAAAAGTTTCCACCAAAGAAAGAAATTCATATTATGATGCCAAGATTGGTTTGCATT 
TGGGGTTGAACACATTGTAATTCTTCTTACGACCACATAATCAAGTGGTTCTCCTTTTTT 
TGTCTGCTAA 

>G2059 Amino Acid Sequence (conserved domain in AA coordinates: 184-254)

MEDQFPKIETSFMHDKLLSSGIYGFLSSSTPPQLLGVPIFIiEGMKSPIiLPASSTPSYFVS 

PHDHELTSSIHPSPVASVPWNFLESFPQSQHPDHHPSKPP* 

MSPYHKY-IPNSFYQSDQNRNEWVFlINKTIiTNYPSKGFGim^ST 

TTPTKLYRGWQRHWGKWVAEIRLPRNRTRV^ 

NFPDLKHQLKSGSLRCMIASIjIjESKIQQISSSQVSNSPSPPPPKVGTPEQKlJniHMKMESG 

EDVMMKXQKSHKEWIEGDGVQLSRMPSLDMDLIWDALSFPHSS * 

>G2105 (42..1487)

CTCTCTGACTTGAACTCTTCT^ 

ATCCACAGTACGGTATAGAACAACCATCTTC 

ACCTCGTTTCAGCGCCGGACCAGCACCATCGTCTTCATTTCACCGACCATGAGATAAGTT 

TATTGCCACGTGGAATACAAGGGCTTACGGTGGCTGGAAACAACAGTAACACTATTACAA 

CGATCCAGAGTGGTGGCTGTGTTGGTGGGTTTAGTGGCTTTACGGACGGCGGAGGAACAG 

GGAGGTGGCCGAGGCAAGAGACGTTGATGTTGTTGGAGGTCAGATCTCGTCTTGATCACA 

AGTTCAAAGAAGCTAATCAAAAGGGTCCTCTCTGGGATGAAGTTTCTAGGATTATGTCGG 

AGGAACATGGATACACTAGGAGTGGCAAGAAGTGTAGAGAGAAGTTCGAGAATCTCTACA 

AGTACTATAAAAAAACAAAAGAAGGCAAATCCGGTCGGCGACAAGATGGTAAAAACTATA 

GATTTTTCCGGCAGCTTGAAGCGATATACGGCGAATCCAAAGACTCGGTTTCTTGCTATA 

ACAAC^CGCAGTTCATAATGACCAATGCTCTTCATAGTAATTTCCGCGCTTCT 

ATAACATCGTCCCTCATCATCAGAATCCCTTGATGACCAATACCAATACTCAAAGTCAAA

GCCTTAGCATTTCTAAC1AATTTCAACTCCTCCTCCGATTTGGATCTAACTTCTTCCTCTG 

AAGGAAACGAAACTACTAAAAGAGAGGGGATGCATTGGAAGGAAAAGATCAAGGAATTCA 
TTGGTGTTC^TATGGAGAGGTTGATAGAGAAGCAAGATTTTTGGCTTGAGAA

AGATTGTGGAAGACAAAGAACATCAAAGGATGCTGAGAGAAGAGGAATGGAGAAGGATTG 

AAGCGGAAAGGATCGATAAGGAACGTTCGTTTTGGACAAAAGAGAGGGAGAGGATTGAAG 

CTCGGGATGTTGCGGTGATTAATGCCTTGCAGTACTTGACGGGAAGGGCATTGATAAGGC 

CGGATTCTTCGTCTCCTACAGAGAGGATTAATGGGAATGGAAGCGATAAAATGATGGCTG 

ATAATGAATTTGCTGATGAAGGAAATAAGGGCAAGATGGATAAAAAACAAATGAATAAGA 

AAAGGAAGGAGAAATGGTCAAGCCACGGAGGGAATCATCCAAGAACCAAAGAGAATATGA 

TGATATACAACAATCAAGT^AACTAAGATTAATGATTTTTGTCGAGATGATGACCAATGCC 

ATCATGAAGGTTACTCACCTTCAAACTCCAAGAACGCAGGAACTCCGAGCTGCAGCAATG 

CCATGGCAGCTAGTACAAAGTGCTTTCCATTGCTTGAAGGAGAAGGAGATCAGAACTTGT 

GGGAGGGTTATGGTTTGAAGCAAAGGAAAGAAAATAATCATCAGTAAGC^ 

TTCTCAAAATGAAGAATAAGAGAACTTAGAAACGAT 

>G2105 Amino Acid Sequence (domain in AA coordinates: 100-153) 
MEDHQiraPQYGIEQPSSQFSSDLFGFl^VSAPDQHHRIiHFTDHEISLLPRGIQGLTVAGN 
NSNTITTIQSGGCVGGFSGFTDGGGTGRWPRQETl^ 
VSRIMSEEHGYTRSGKKCREKFEl^YKYYKKTKE^ 

DSVSCYMNTQFIMTNALHSNFRASNIHNI VP^^ SSDIj 
DLTS S S EGNETTKREGMHWKEKIKEF IGVHMERL I EKQDFWLEKLMKI VEDKEHQRMLRE 
EEWRRIEAERIDKERSFWTKERERIEARDVAVINALQYIiTGRALIRPDSSSPTERINGNG 
SDKMMADl^FADEGNKGKMDKKQMNKKRKEKWS SHGGNHPRTKENMMIYNNQETKINDFC 
RDDDQCHHEGYS PSNS KNAGTPS CSNAMAASTKCFPLIiEGEGDQNLWEGy GLKQRKENNH 
Q* 

>G2117 (49..465)

ATACTTGTCAACAAAAATTTTCTTAAAGAACGCATA 

GTCTATAACCTTCCAAGTCAAAACCCTAATCCACAGTCTTTATTCCAAATCTT^ 
CGAGTACCACTTTCAAACTTGCCTGCCACGTCAGACGACTCTAGCCGGACTGCAGAAGAT 
AATGAGAGGAAGCGGAGAAGGAAGGTATCGAACCGCGAGTCAGCTCGGAGATCGCGTATG 
CGGAAACAGCGTCACATGGAAGAACTGTGGTCCATGCTTGT^ 

AAATCTCTAGTCGATGAGCTAAGCCAAGCC^GGGAATGTTACGAGAAGGTTATAGAAGAG 
AACATGAAACTTCGAGAGGAAAACTCCAAGTCGAGGAAGATGATTGGTGAGATCGGGCTT 
AATAGGTTTCTTAGCGTAGAGGCCGATCAGATCTGGACCTTCTAATCGTCTCGTAAGCTT 
GTTGGTTTTTTGTTGTTTATTTAAAG 

>G2117 Amino Acid Sequence (conserved domain in AA coordinates: 46-106)

MAGSvYNLPSQNPNPQSLFQIFVTDRVTLSNLP 

RSRMRKQRHMEELWSMIiVQLINKNKSL^ 

EIGLNRFIjSVEADQIWTF * 

>G2124 (87..923)

GAAC^GO^AAACCCTAGATTTCCTGTTCAAGCTC^GACCGTACAAAACTTTGGAAC^ 

TATATAAAGATCTCGAGAATAGC^TTATGAATATCGTCTCTTGGAAAGATGCAAACGACG 

AAGTTGCAGGCGGCGCTACGACAAGACGTGAAAGAGAAGTAAAAGAGGATCAAGAAGAAA 

CCGAAGTCAGAGCC^CCAGTGGC^AAACCGTAATTAAAAAGCAGCCTACATCGATCrCOT 

CTTCTTCTTCTTCGTGGATGAAATCCAAGGATCCGAGGATTGTTAGGGTTTCACGCGCCT 

TTGGAGGCAAAGACCGTCACAGCAAAGTGTGTACGTTACGTGGACTACGTGACAGACGCG 

TGAGATTATCAGTCCCAACGGCTATTCAGCTCTACGATCTTCAAGAACGGCTCGGTGTTG 

ACCAGCCTAGCAAAGCCGTTGACTGGTTGCTTGATGCAGCTAAAGAGGAGATCGACGAGC 

TACCTCCGTTACCTATCTCGCCGGAAAATTTCAGC^TCTTCAACCATCATCAGTCCTTCT 

TGAATCTTGGTCAACGGCCCGGTCAAGATCCGACCCAACTCGGGTTTAAAATCAATGGAT 

GTGTACAAAAGTCXACTACTACTAGCCGCGAAGAAAACGATAGAGAGAAAGGAGAAAACG 

ATGTCGTTTACACAAACAATC^T(^TGTTGGGTCTTATGGAACTTATCACAACCTGGAAC 

ATCATCATCATCATCACCAACATTTGAGTTT^ 

ATAGTCTTGTCCCATTTCC^TCACAAATTTTGGTATGTCCAATGACGACATCACCAACAA 
CTAO^CTATAO^TCTTTGTTTCCATC^TCATCGTCAGCTGGTTCAGGGACTATGGA^ 
CATTAGATCCGAGGCAAATGTAGCAACAATGGTGGTAGAGACATTGATAATCGGATGTCG 
TCGGTCCAATTCAACCGAACTAATAGCACTACAACGGCT 

TCGGAGCGTTGTACAAGTAGAGGAAGTGATCACCATATGTGAAGTTAGATTATTGAAACG 
ATATAATTGTTGTTTGATGTGTTCAGAAATAAGGGGACAC 

>G2124 Amino Acid Sequence (domain in AA coordinates: 75-132) 
MNIVSWKDANDEVAGGATTRREREVKEDQEETEVRATSGKTVIKKQPTSISSSSSSWMKS
KDPRIWVSRAFGGKI>RHSKVCTLRGLRDRRVRLSVPT^ 

LLDAAKEEIDELPPLP I SPENFS I FNHHQSFIiNLGQRPGQDPTQLGFKINGCVQKSTTTS 
REENDREKGENDVVYTNNHHVGSYGTYHNIiEHHHHHHQH^ 
ILVCPMTTSPTTTTIQSLFPSSSSAGSGTMETIiDPRQM* 
>G2140 (148..1254)

ACTCTCTTAACTTTCGTTTCTTCTCCTACCTTCTTTTACCAACCTTTCCTTTCTCTTACA 

CACATATATATATACATATATAGAGAGAGAGAAGAGGACAAAGAGTTGAAAGATGAAGAC 

TCTCATGTCTTCATAGAAACAAGTGATATGTGCGCTAAGAAAGAAGAAGAAGAAGAAGAA 

GAAGAAGACAGTTCTGAAGCC^TGAACAAC^TAC^AAATTACCAAAATGACCTCT^ 

CACCAACTCATCTCTCATCATCACCATCATCATCATGATCCTTCTCAA 

GGAGCATCCGGTAACGTTGGATCTGGTTTCACTATCTTCTCTCAAGATTCCGTCTCTCCA 

ATATGGTCTCTACCTCCACCTACCTCGATCCAACCACCATTTGATCAGTTTCCTCCTCCT 

TCTTCTTCTCCAGCATCTTTCTACGGAAGTOT 

GGATTACAGTTTGGGTACGAGGGTTTTGGTGGAGCCACGTCAGCAGCACATCATCATCAT 

GAACAACTTCGGATCTTGTCGGAAGCTTTAGGTCCGGTAGTACAAGCCGGGTCCGGTCCT 

TTTGGGTTACAAGCTGAGTTAGGGAAGATGACAGCACAAGAGATCATGGACGCT 

TTGGCTGCTTCAAAGAGTCATAGTGAAGCTGAGAGAAGAAGAAGAGAGAGAATCAATAAT 

CATCTCGCTAAGCTCCGTAGCATATTACCCAACACCACCAAAACGGATAAAGCGTCGTTA 

CTAGCTGAAGTGATCCAACATGTGAAAGAGTTGAAGAGAGAGACTTCAGTGATCTCAGAG 

ACAAATCTTGTCCCAACGGAAAGCGATGAGTTAACGGTAGCTTTCACGGAGGAGGAAGAA 

ACCGGAGATGGCAGATTTGTAATTAAAGCGTCGCTTTGCTGTGAAGACAGGTCGGATCTC 

TTGCCTGAC^TGATTAAAACATTGAAAGCTATGCGTCTCAAAACGCTCAAGGCGGAGATA 

ACCACCGTTGGGGGACGAGTCAAGAACGTTTTGTTTGTTACCGGAGAAGAGAGCTCCGGT 

GAGGAAGTGGAGGAAGAGTACTGTATAGGGACGATTGAGGAAGCTTTGAAAGCGGTGATG 

GAGAAGAGCAATGTAGAGGAATCATCTTCTTCTGGAAATGCTAAGAGACAGAGAATGAGT 

AGTCACAACACTATCACTATCGTCGAACAACAACAACAATATAATCAGAGGTAATCAATT 

TTTTACTTAAATCGCTTTTTTTTCTTACTTTCGGTGTATCTACTACGTGTGTTGTTTGCT 

GGTTATGGAAATGAATGTTGTACGTCACGTTATACTATAGATATATGTGTGTTTGTGTGT 

ATGTATAACGGAAGTATTTGTATCCGTTGTGGTCTTGGACTTTTGGTTTGGTTCTAAGAT 

ACTTATTTTTAAAAACTTGTATCGTTGAGTTGGTTTTCTAGATATGCTTAATGGGAGTAT 

GTGACGAAAAAAAA 

>G2140 Amino Acid Sequence (domain in AA coordinates: 167-242)
MGAKKEEEEEEEEDSSEAMNNIQNYQNDLFFHQLISHHH^ 

FTIFSQDSVSPIWSLPPPTSIQPPFDQFPPPSSSPASFYGSFFNRSRAHHQGLQFGYEGF 
GGATSAAHHHHEQLRILSEALGPWQAGSGPFGLQAELGKMT^^ 
AERRRRERINNHLAKLRSILPNTTKTDKASLLAEVIQ 
ELTVAFTEEEETGDGRFVIKASLCCEDRSDLL^ 

VIjFVTGEESSGEKTOEEYCIGTIEEALKAVMEKSNVEESSSSGNAKRQRMSSHNTITIVK 

QQQQYNQR* 

>G2144 (102..1241)

ATTAGGGTTTTGTTGTCGTGAGATTTGATTACAC^AATTGCTGAATTTGGTTTCGATTAT 
TGGTGTTATTGTTTTCGAAGATTTCCAGTGAGTTTCCGTTTATGGATCTGACTGGAGGAT 
TTGGAGCTAGATCCGGCGGTGTTGGACCGTGCCGGGAACCAATAGGCCTTGAATCGCTAC 
ATCTCGGTGACGAATTTCGGCAACTAGTGACGACTTTACCTCCCGAGAACCCCGGCGGTT 
CGTTCACGGCTTTGCTTGAGCTTCCACCTACACAAGCAGTGGAGCTTCTCCATTTCACT^ 
ATTCTTCGTCTTCTCAAC^^GCGGCAGTGACAGGGATCGGTGGAGAGATTCCTCCGCCGC 
TTCACTCTTTCGGTGGGACATTGGCTTTTCCTTCTAACTCAGTTCTCATGGAGCGAGCAG 
CTCGTTTCTCGGTGATTGCCACTGAGCAACAAAACGGAAATATCTCCGGGGAGACTCCGA 
CGAGCTCTGTACCTTGCAATrCAAGTGCTAATCTCGACAGAGTCAAGACGGAGCCTGCTG 
AGACCGATTCATCTCAGCGGTTGATTTCTGATTCAGCGATTGAGAATCAAATCCCTTGCC 
CTAACGAGAACAATCGAAATGGGAAGAGGAAAGATTTCGAAAAGAAGGGTAAAAGCTCGA 
CGAAGAAGAAGAAAAGCTCTGAAGAGAACGAGAAGCTGCCATATGTTCACGTTAGAGCTC 
GTCGTGGTCAAGC^y\CCGATAGCCATAGCTTAGCAGAACGAGCAAGAAGAGAGAAGATAA 
ATGCACGAATGAAGCTGTTACAGGAACTGGTCCCAGGCTGTGATAAGATTCAAGGTACCG 
CGCTGGTGCTGGATGAAATCATTAACCATGTCCAGTCATTACAACGTCAAGTGGAGATGC 
TATCAATGAGACTTGCTGCGGTAAACCCCAGAATCGACTTCAATCTCGACACCATATTGG 
CTTCAGAAAACGGTTCTTTAATGGATGGGAGCTTCAATGCCGCACCAATGCAGCTTGCTT
GGCCTCAGCAAGCGATTGAGACCGAACAGTCCTTTCATCACCGG 

CAACACAACAATGGCCTTTTGACGGCTTGAACCAGCCGGTATGGGGAAGAGAAGAGGATC 
AAGCTCATGGCAATGATAACAGCAATTTGATGGCAGTTTCTGAAAATGTAATGGTGGCTT 
CTGCTAATTTGCACCCT^AATCAGGTCAAAATGGAGCTGTAAGTTGGGAAAACGGTAGAGA 
TCATGAATGTGTATATACATCGTATAAGCTCGTTTCTCTCTATATAAATATAATCATAAA 
TATAGATATCTGTTAAGAAGGTATCAGTCATTTGATTCAGAGAGACAACACTGGTATGAT 
TGTTTCTTATTCTTGTACCAGATTTCGACAATGTAGAATTTAGTAGGATATGATCATTTT 
GATCTCGTTATATATA 

>G2144 Amino Acid Sequence (domain in AA coordinates: 203-283)

MDLTGGFGARSGGVGPCREPIGLESLHLGDEFRQLVTTLPPENPGGSFTAIiLEIiPPTQAV 

ELIiHFTDSSSSQQAAVTGIGGEIPPPLHSFGGTLAPPSNSVIjMERAARFSVIATEQQNGN 

ISGETPTSSVPSNSSANLDRVKTEPAETDSSQRLiI SDSAIENQIPCPNQNNRNGKRKDFE 

KKGKSSTKKNKSSEENEKLPYVHVRARRGQATDSHSI^ 

DKIQGTALVLDEIIiraVQSLQRQVE^SMRIiAAWPRIDFNIiD 

APMQLAWPQQAIETEQSFHHRQLQQPPTQQWPFDGLNQPWGREEDQAHGNDNSNLMAVS 
ENVMVAS ANLHPNQVKMEL * 
>G2431 (47..1057)

CCCTTTCGTTTTTATTTAAATTTCTTGGGTCGTTTCTTAAATTTGTATGTGTTTATTAAT 
GGAGATCAAC^^TAATGCCAACAATACTAATACTACTAT^ 

GAGCCTTGTGTTGTCAACGGATGCTAAGCCAAGGTTGAAATGGACTTGTGATCTTCATCA 
C^AATTCATCGAAGCCGTTAATCAACTTGGAGGACCTAACAAAGCAACACCTAAGGGTTT 
GATGAAGGTTATGGAGATTCCTGGGCTTACCTTATACCATCTCAAGAGCCZATTTACAGAA 
ATATCGGTTAGGGAAGAGCATGAAGTTCGATGATAACAAGCTAGAAGTTTCCTCTGCATC 
AGAGAATCAAGAAGTTGAGAGTAAAAACGATTCAAGAGATCTCCGAGGCTGCAGTGTCAC 
CGAAGAAAAGAGCAATCGAGCTAAAGAAGGGCTAC^^ 

GATGGAAGTTCAGAAGAAACTTCATGAACAAATCGAAGTTCAGAGGCATTTGCAGGTGAA 

GATTGAGGC^CAAGGAAAGTATCTA(^GTCOTTTTTAATGAAAGCTCAACAAACTCTC^ 

TGGCTACTCATCTTC^AATCTCGGCM 

TTCAATGGTGAACAGAGGCTGTCCAAGCACTTCGTTCTCAGAGCTAACGCAAGTAGAAGA 

AGAAGAAGAAGGTTTCTTGTGGTACAAGAAACCAGAAAACAGAGGAATTAGTCAGCTGAG 

ATGTTCAGTAGAGAGCTCGTTGACATCTTCAGAGACCTCAGAGACAAAACTGGATACTGA 

CAATAACCTTAATAAATCGATTGAACTTCCGTTGATGGAGATCAACTCGGAAGTGATGAA 

GGGGAAGAAGAGAAGCATAAACGACGTCGTTTGCGTGGAGCAGCCTCTAATGAAGAGAGC 

TTTTGGAGTTGATGATGATGAGCATTTGAAGTTGAGTTTGAATACTTACAAGA 

GGAGGCGTGTACGAACATAGGACTAGGGTTTAATTAAAAAAAAAACATTTTACTAAAGTT 

ATATAAAAATGTTTTAAAAGAATCCA 

>G2431 Amino Acid Sequence (conserved domain in AA coordinates: 38-88)

MCLLMEINNNANNTNTTIDNHKATO 

TPKGLMKVMEIPGLTLYHLKSHL^ 

GCSVTEENSNPAKEGLQITEALQMQMEVQKKIjHEQIEVQRHLQVKIEAQGKYLQSVLMKA 
QQTIiAGYSSSNLGMDFARTEIjSRIjASMVNRGCPSTSFSELTQVEEEEEGFLWYKKPENRG 
ISQLRCSVESSIjTSSETSETKLDTDNOTjNKSIELPIiMEINSEVMKGKKRS ihdwcveqp 

lmkrafgvdddehlklslntykkdmeactniglgfn* 

>G2465 (86..1150)

c^tattctttctccattgagattaagcttctttctcgctgtcgtctctctatagatctt 
ggttcttagtcccttttgaataataatgatggtggagatggattacgctaagaaaatgca 
gaaatgtcatgaaxacgttgaagcacttgaagaagaacagaagaaaatccaagtctttca 
acgcgagcttcctttatgtttagagcttgtcactcaagcgatcgaagcttgtcggaagga 
gttatctggtacgacgacaactac^tcagaagagt^ 

tggtggtcctgtctttgaagagtttattcctatc^^gaaaattagttccttgtgtgaaga 
agtacaagaagaagaagaagaagatggtgaacatgaatcttctccagaacttgtgaataa 
taagaaatcagattggcttagatctgttcagctatggaatcattcaccggatctaaatcc 
aaaagaggagcgtgtagctaagaaagcgaaagtggtggaggtgaaaccaaaaagcggtgc 
gtttcagccgtttcaaaagcgcgttttggagactgatttgcaaccggcggtgaaagtagc 
tagttcgatgccagcgacgacgacgagttctacgacggaaacttgtggtggtaaaagtga 
tttgattaaagctggagatgaggaaagacggatagagcagcagcaatcgcagtcgcatac 
GCATAGAAAACAAAGGCGGTGCTGGTCGCCGGAATTACACCGTCGATTCCTAAACGCGCT
TCAGCAGCTTGGAGGATCTCATGTTGCTACACCAAAGCAAATCAGGGATCACATGAAGGT 
TGATGGATTAACAAACGACGAAGTTAAAAGCCATTTACAGAAATATAGACTTCACACAAG 
AAGGCCAGCAGCAACATCCGTGGCGGCACAAAGTACCGGGAATCAGCAACAACCACAATT 
TGTGGTGGTTGGAGGCATATGGGTACCATCGTCACAAGATTTTCCACCACCGTCCGATGT 
AGCCAACAAGGGTGGTGTATATGCTCCGGTTGCGGTGGCGCAATCTCCAAAACGTTCGTT 
GGAGAGAAGTTGCAACTCGCCGGCGGCATCTTCCTCTACAAATACAAATACTTCTACTCC 
TGTGTCATAATCTGATAGTCIATACTATAATCATCTCCTGATGTTGATTTTGGTGTAGGTT 
TGAAAATGTTTATGTGAATGTAA 

>G2465 Amino Acid Sequence (conserved domain in AA coordinates: 219-269)
MMVEMDYAKKMQKCHEYV^^ 

SEQCSEQTTSVCGGPVTEEFIPIKKISSLCEEVQEEEEEDGEIIESSPELVNNKKSDWIjRS 
VQLVTOHSPDLNPKEERVAKXAKVVEVKPKSG 

SSTTETCGGKSDLIKAGDEERRIEQQQSQSHTHRKQRRCMSPELHRRFLNALQQLGGSHV 
ATPKQIRDHMKVDGIiTlTOEvTCSHLQK^ 

PSSQDFPPPSDVA^GGVYAPVAVTVQSPKRSLERSraSPAAS 
>G2583 (38..607)

CAAATCAGAAAATATAGAGTTTGAAGGAAACTAAAAGATGGTACATTCGAGGAAGTTCCG 
AGGTGTCCGCCAGCGACAATGGGGTTCTTGGGTCTCTGAGATTCGCCATCCTCTATTGAA 
GAGAAGAGTGTGGCTTGGAACTTTCGAAACGGCAGAAGCGGCTGCAAGAGCATACGACCA 
AGCGGCTCTTCTAATGAACGGCCAAAACGCTAAGACCAATTTCCCTGTCGTAAAATCAGA 
GGAAGGCTCCGATCACGTTAAAGATGTTAACTCTCCGTTGATGTCACCAAAGTCATTATC 
TGAGCTTTTGAACGCTAAGCTAAGGAAGAGCrrGC^AAGACCTAACGCCTTCTTTGACGTG 
TCTCCGTCTTGATACTGAC^GTTCCCACATTGGAGTTTGGCAGAAACGGGCCGGGTCGAA 
AACAAGTCCGACTTGGGTCATGCGCCTCGAACTTGGGAACGTAGTCAACGAAAGTGCGGT 
TGACTTAGGGTTGACTACGATGAACAAACAAAACGTTGAGAAAGAAGAAGAAGAAGAAGA 
AGCTATTATTAGTGATGAGGATCAGTTAGCTATGGAGATGATCGAGGAGTTGCTGAATTG 
GAGTTGACTTTTGACTTTAACTTGTTGCAAGTCCAC^ 

>G2583 Amino Acid Sequence (domain in AA coordinates: 4-71)

MVBSRKFRGTOQRQWGSWVSEIimPIiLKRRVWLGTFETAEAAARAYDQAAL 

OTPWKBEEGSDHVKDWSPLMSPKSIiSELIiNAKLR^ 

WQKElAGSKTSPTWVMRIiELGNVVNE S AVDLGLTTMNKQNVEKEEEEEEAI I SDEDQLAME 

MIEELLNWS* 

>G2724 (1..651) 

ATGGAAATAGAAATAAGGAGAGGTCCATGGACTGTGGAAGAAGACATGAAGCTCGTCAGT 
TACATTTCTCTTCACGGTGAAGGAAGATGGAACTCCCTCTCTCGTTCTGCTGGACTGAAT 
AGAACGGGGAAAAGTTGCAGATTGCGGTGGCTAAATTATCTCCGGCCGGATATCCGCCGT 
GGAGACATATCCCTTCAAGAACAATTTATGATCCTTGAACTCCATTCTCGTTGGGGAAAT 
CGGTGGTCAAAGATTGCTCAACATTTACCGGGAAGAACAGATAACGAGATAAAGAATTAT 
TGGAGAACACGTGTTCAAAAGCATGCAAAACTTCTAAAATGTGACGTGAACAGCAAGCAA 
TTCAAAGACACCATCAAACATCTCTGGATGCCTCGTCTCATCGAGAGAATCGCCGCCACT 
CAAAGTGTCCAATTTACCTCTAACCACTACTCGCCTGAGAACTCCAGCGTCGCCACCGCC 
ACGTCATCAACGTCGTCGTCTGAGGCTGTGAGATCGAGTTTCTACGGTGGTGATCAGGTG 
GAATTTGGAACGTTGGATCATATGACAAATGGTGGTTATTGGTTCAACGGCGGAGATACG 
TTTGAAACTTTGTGTAGTTTTGACGAGCTCAACAAGTGGCTCATACAGTAG 

>G2724 Amino Acid Sequence (conserved domain in AA coordinates: 7-113)

MEIEIRRGPWTVEEDMKLVSYISLHGEGRWNSLSRSAGLNRTGKSCRLRWLNYLRPDIRR

GDISLQEQFMILELHSRWGNRWSKIAQHLPGRTDNEIKNYWRTRVQKHAKLLKCDVNSKQ

FKDTIKHLWMPRLIERIAATQSVQFTSNHYSPENSSVATATSSTSSSEAVRSSFYGGDQV

EFGTLDHMTNGGYWFNGGDTFETLCSFDELNKWLIQ*

>G377 (1..396) 

atgggtctctcgcattttccaacagcgtcagaaggagtactaccacttctggtgatgaac 
acggttgtttcaatcactctgttgaagaacatggtgaggtctgtttttcaaattgttgca 
tccgagactgaatcttccatggagatagacgacgagcctgaagatgattttgttactaga 
agaatctcgataacacagttcaagtctctatgtgagaacatagaagaggaagaagaagag 
aaaggtgtggagtgttgtgtgtgcctttgtgggtttaaagaggaagaggaagtgagtgag 
ttggtttcttgcaagcatttcttccacagagcttgtctagacaactggtttggtaataac 
cacaccacatgccctctttgcaggtccattctctag

>G377 Amino Acid Sequence (domain in AA coordinates: 85-128)
MGLSHFPTASEGVLPLLVMNTVVSITLLKNMVRSVFQIVASETESSMEIDDEPEDDFVTR
RISITQFKSLCENIEEEEEEKGVECCVCLCGFKEEEEVSELVSCKHFFHRACLDNWFGNN

HTTCPLCRSIL*
>G428 (97..1032)

TTACTTTTGTGTTTCTTCATATTCTTCAGAAGCAAGCACAAGGCTAGGGATCGAAGAAGC 

GGCGATCACTGATCGTATCTCACTACGATCACATTAATGGATAGAATGTGTGGTTTCCGC 

TCGACGGAAGACTATTCGGAGAAAGCGACGTTGATGATGCCGTCCGATTATCAGTCTTTG 

ATTTGTTCAACCACCGGAGACAATCAAAGACTGTTTGGATCCGACGAACTCGCTACCGCT 

TTGTCCTCGGAGTTGCTTCCGCGTATTCGAAAAGCTGAGGATAATTTCTCTCTTAGTGTC 

ATCAAATCCAAAATCGCTTCTCATCCTTTGTATCCTCGCTTACTCCAAACCTACATCGAT 

TGCCAAAAGGTGGGAGCGCCTATGGAAATAGCGTGTATATTGGAAGAGATTCAGCGAGAG 

AACCATGTGTACAAGAGAGATGTTGCTCCATTATCTTGCTTTGGAGCTGATCCTGAGCTT 

GATGAATTCATGGAAACCTACTGTGATATATTGGTTAAATACAAAACCGATCTTGCGAGG 

CCGTTCGACGAGGCTAC^CTTT(^TAAACAAGATTGAAATGCAGCTTCAGAACTTGTGC 

ACTGGTCCAGCGTCTGCTACAGCTCTTTCAGATGATGGTGCGGTTTCATCTGACGAGGAA 

CTGAGAGAAGATGATGACATAGCAGCGGATGACAGCCAACAAAGAAGCAATGACCGCGAT 

CTGAAGGACCAGCTACTACGCAAATTTGGTAGCCATATCAGTTCATTGAAACTCGAGTT 

TCTAAAAAGAAGAAGAAAGGGAAGCTACCAAGAGAAGCAAGACAAGCGTTGCTCGATTGG

TGGAATGTTCATAATAAATGGCCTTACCCTACTGAAGGCGACAAAATAGCrCTGGCTGAA 

GAAACAGGTTTGGATCAAAAACAAATCAACAATTGGTTTATAAACCAAAGGAAACGCCAT 

TGGAAGCCTTCGGAGAACATGCCGTTTGATATGA 

ACCGAGGAATGAAAAGAGAGACATGGGATTGTGCATTGTATAATTTTTACACTGTTTTCC 
CAAGAAAAGAAAAGAGTAAAAAGCTTTTGGTAAATG 

CCAGTTAGCCAAAACGGTCAAGGGCGTGGCGTAACGAGACATTGTATTGGAAATAGTGGC 

AATATTATGTCACTAATCTTCCAATGGTCCAAAATGATAGATTTCTTATTTGTAT^GAAC 

CTTACTTAGATAGCTGATGTGTCAACTAAATAATTTATTTTCATCCTTATACTACTTGTA 

TCAATGTCTCTAATTGATCAATTGTTGCTTGCTATTCAAAAAAAAAAAAAA 

>G428 Amino Acid Sequence (domain in AA coordinates: 229-292) 

MDRMCGFRSTEDYSEKATIiMMPSDYQSLICSTTGDNQRLFGSDEIiATAliSSELIjPRIRKA 

EDNFSIiSVIKSKIASHPLYPRIiLQTYIDCQKVGAPMEIACILEEIQRENHVYKRDVAPLS 

CFGADPELDEFMETYCTILVKYKTDiLARPro 

GAVSSDEELREDDDIAADDSQQRSlTORDLKDQLLRKFGSHISSLKIiEFSKKKKKGKLPRE 

ARQALLDWWNVHNKWPYPTEGDKIALAEETC^ 

DDSNETFFTEE* 

>G447 (241..3501)

CTTTTTAAGAGCTTAAAAATTTGCTTTGAAGCTTCAAATATTCTTATGAACTAAAAAGAA 
GAAAAAAGCTTTTGTTTCTTTTTCCTTAGCAGCAGAATGATTTTTGTTTCCAAAATTATT 
ACTATTTAGTTTCTCTCGTGCTCTTCTCTTGAGCAAATAGAGATTCGTTAATTTTGCTGA 
AGAAGAAGAACTCTGTTTCTTCCCTGCACCAAACCAATTTTTTCGTTCTTTCTATAAACC 
ATGAAAGCTCCATCAAATGGATTTCTTCGAAGTTCCAACGAAGGAGAGAAGAAGCCAATC 
AATTCTCAACTATGGCACGCTTGTGCAGGGCCTTTAGTTTCATTACCTCCTGTGGGAAGT 
CTTGTGGTTTACTTCCCTCAAGGACACAGCGAGGAAGTTGCAGCATCGATGCAGAAGCAA 
ACAGATTTTATACCAAATTACCCAAATCTTCCTTCTAAGCTGATTTGCTTGCTTCACAGT 
GTTACATTACATGCTGATACCGAAACAGATGAAGTCTATGCACAAATGACTCTTCAACCT 
GTGAATAAGTATGATAGAGAAGCATTGCTAGCTTCTGATATGGGCTTGAAGCTAAACAGA 
CAACCTACTGAGTTOTTTTGCAAGACTCTTACTGC^ 

TTCTCTGTACCGCGTCGTGCAGCTGAGAAAATATTCCCTCCTCTTGATTTCTCGATGCAA 
CCGCCTGCGCAAGAGATTGTAGCTAAAGATTTACATGATACTACATGGACTTTCAGACAT 
ATCTATCGAGGCCAACCAAAAAGACACTTGCTTACC^ 

ACAAAGAGACTATTTGCGGGTGATTCAGTTTTGTTTGTAAGAGATGAGAAATCACAGCTG 
ATGTTGGGTATAAGACGTGCAAATAGACAAACTCCGACTCTTTCCTCATCGGTCATATCC 
AGCGACAGTATGCACATTGGGATACTTGCAGCTGCAGCTCATGCTAATGCCAATAGTAGC 
CCTTTTACCATCTTCTTCAATCCAAGGGCAAGTCCTTCAGAGTTTGTAGTTCCTTTAGCC 
AAATACAACAAAGCCTTATACGCTCAAGTATCTCTAGGAATGAGATTCCGGATGATGTTT 
GAGACTGAGGATTGTGGGGTTCGTAGATATATGGGTACAGTCACAGGTATTAGTGATCTT 






GACCCTGTAAGATGGAAAGGCTCACAATGGCGTAATCTTCAGGTAGGATGGGATGAATCA 

ACAGCTGGAGATAGGCCAAGCCGAGTATCCATATGGGAAATCGAACCCGTCATAACTCCT 

TTTTACATATGTCCTCCTCCATTTTTCAGACCTAAGTACCCGAGGCAACCCGGGATGCCA 

GATGATGAGTTAGACATGGAAAATGCTTTCAAAAGAGCAATGCCTTGGATGGGAGAAGAC 

TTTGGGATGAAGGACGCAC^GAGTTCGATGTTCCCTGGTTTAAGTCTAGTTCAATGGATG 

AGTATGCAGCAAAACAATCCATTGTCAGGTTCTGCTACTCCTCAGCTCCCGTCCGCGCTC 

TCATCTTTTAACCTACCAAACAATTTTGCTTCCAACGACCCTTCCAAGCTGTTGAACTTC 

CAATCCCCAAACCTCTCTTCCGCAAATTCCCAATTCAACAAACCGAACACGGTTAACCAT 

ATCAGCCAACAGATGCAAGCACAACCAGCCATGGTGAAATCTCAACAACAACAACAACAA 

CAACAACAACAACACCAACACCAACAACAACAACTGCAACAACAACAACAACTACAGATG

TCACAGC^CAGGTGCAGCAACAAGGGATTTATAACAATGGTACGATTGCTGTTGCTAAC 

CAAGTCTCTTGTCAAAGTCCAAACCAACCTACTGGATTCTCTCAGTCTCAGCTTCAGCAG 

CAGTCAATGCTCCCTACTGGTGCTAAAATGACACACCAGAACATAAATTCTATGGGGAAT 

AAAGGCTTGTCTCAAATGACATCGTTTGCGCAAGAAATGCAGTTTCAGCAG 

ATGCATAACAGTAGCCAGTTATTAAGAAACCAGCAAGAACAGTCCTCTCTCCATTCATTA

CAACAAAATCTGTCCCAAAATCCTCAGCAACTCCAAATGCAACAACAATC^ 

AGTCCTTCACAACAGCTTCAGTTGCAGCTACTGCAGAAGCTACAGCAGC^ 

CAGTCGATTCCTCCAGTAAGCTCATCCTTACAGCCACAATTATCAGCGT^ 

CAAAGCCATCAATTGCAACAACTTCTGTCGTCT 

AATAACAGCTTCCCAGCTTCAACTTTCATGCAGCCTCCACAGATT 

CAG<^GGACAGATGAGTAACAAAAATCTTGT^^ 

ACAGATGGAGAAGCTCCTTCTTGTTCAACCTCACCTTCCGCCAATAACACGGGACATGAT 
AATGTTTCACCGACAAATTTCCTGAGC^^ 

TCTGCATCTGATTCAGTCTTTGAGCGCGCAAGCAATCCGGTCCAAGAGCTTTATACAAAA 

ACTGAGAGCCGGATCAGTCAAGGCATGATGAATATGAAGAGTGCTGGTGAACATTTCAGA 

TTTAAAAGCGCGGTAACAGATCAAATCGATGTATCCACAGCGGGAACGACGTACTGTCCT 

GATGTTGTTGGCCCTGTACAGCAGCAACAAACTTTCCCACTACCATCATTTGGTTT 

GGAGACTGCCAATCTCATCATCCAAGAAACAACTTAGCTTTCCCTGGTAATCTCGAAGCC 

GTAACTTCTGATCCACTCTATTCTCAAAAGGACTTTCAAAACTTGGTTCCCAACTATGGC 

AACACACCAAGAGACATTGAGACGGAGCTGTCCAGTGCTGCAATCAGTTCTCAGTCATTT

GGTATTCCCAGCATTCCCTTTAAGCCCGGATGTTCAAATGAGGTTGGCGGCATCAATGAT 

TCAGGAATCATGAATGGTGGAGGACTGTGGCCCAATCAGACTCAACGAATGCGAACATAT 

ACAAAGGTTCAAAAACGAGGGTCAGTAGGTAGATCAATAGATGTTACCCGTTATAGCGGC 

TATGATGAAOTTAGGCATGACTTAGCGAGAATGTTTGGCATCGAAGGACAGCTCGAAGAT 

CCGCTAACCTCTGATTGGAAACTCGTCTACACCGATCACGAAAACGATATTTTACTAGTT 

GGTGATGATCCTTGGGAAGAGTTTGTGAACTGCGTGCAGAACATAAAGATACTATCATCA 

GTAGAAGTTCAGCAAATGAGCTTAGACGGAGATCTTGCAGCTATCCCAACCACAAACCAA 

GCCTGCAGCGAAACAGACAGCGGAAATGCTTGGAAAGTACACTATGAAGACACTTCTGCT 

GC^GCTTCTTTCAACAGATAGAAATAAAAAGATGCAAATATACCAAGTCAACTTACATTA 

TCATTCGAGGCCATCGCAAAGTACATGTTTTTTTTTGTGTGTATGTACTGCAAAC^ 

ACTGAGAAGAAGAAGATACTGCACGGTATATAAACATTTTTATAGGACAGTGATTTGATT 

TTTCATTCTAACTTGATGTTGTTGTACTTTCTTGTTTCCATATTTGTATAACAAGTATAA 

TGCTTGACAAGTCTATGAGGAGCATATCTTATACAGAGATACTAAGATGTAATGTTAATG 

TAACTAAACAATTACCTTCATTAATCATGAATCCTTTGGTCGTTTAAAA 

>G447 Amino Acid Sequence (conserved domain in AA coordinates: 22-356)

MKAPSNGFLPSSNEGEKKPINSQLWHACAGPLVSLPPVGSIjVVbfFPQGHSEQVAASMQKQ 

TDF I PNYPNLPS KL I CLLHS WLHADTETDE VYAQMTLQ P WKYDREALLASDMGLKLNR 

QPTEFFCKTLTASD^THGGFSWRRAAEKIFPPIiDFSMQPPAQEIVAKDLHDTTWTFRH 

I YRGQPKRHLLTTGWS VFVS TKRLFAGDS VLFVRDEKS QLMLG IRRANRQTPTLSS S VI S 

SDSMHIGILAAAAHANANSSPFTIFFNPRASPSEFVVPLAKYNKALYAQ 

ETEDCG VRRYMGTVTGI SDLDPVRWKGSQWRNLQVGWDESTAGDRPSRVS I WE IEPVITP 

FYICPPPFFRPKYPRQPGMPDDELDMENAFKRAMPWMGEDFGMKDAQSSMFPGLSLVQWM 

SMQQNNPIiSGSATPQLPS ALS SFNliPNNFASNDPSKtiliNFQS PNLS S ANSQFNKPNTVNH 

ISQQMQAQPAMVKSQQQQQQQQQQHQHQQQQLQQQQQLQMSQQQVQQQGIYNNGTIAVAN

QVSCQSPNQPTGFSQSQLQQQSMLPTGAKMTHQNINSMGNKGLSQMTSFAQEMQFQQQLE

MHNSSQLLRNQQEQSSLHSLQQNLSQNPQQLQMQQQSSKPSPSQQLQLQLLQKLQQQQQQ

QSIPPVSSSLQPQLSALQQTQSHQLQQLLSSQNQQPLAHGNNSFPASTFMQPPQIQVSPQ
QQGQMSNKNLVAAGRSHSGHTDGEAPSCSTSPSAN^^

SASDSVPERASNPVQELYTKTESRISQGMMNMKSAGEHFRFKSAVTDQIDVSTAGTTYCP 
DWGPVQQQQTFPLPSFGFDGDCQSHHPRNNIAFPGNLEAVTSDPLYSQKDFQNLVPNYG 
NTPRDIETELSSAAISSQSFGIPSIPFKPGCSNEVGGINDSGIMNGGGLWPNQTQRMRTY 
TKVQKRGSVGRSIDVTRYSGYDEI*RHDIiARMFGIEGQLEDPLTSDWKLVYTDHEM)ILLV 
GDDPWEEFVNCVQNIKILSSVEVQQMSLDGDLAAIPTTNQACSETDSGNAWKVHYEDTSA 
AASFNR* 

>G464 (41..760)

CTCTGCTGGTATGATTGGAGTCTAGGGTTTTGTTATTGACATGCGTGGTGTGTCAGAATT 

GGAGGTGGGGAAGAGTAATCTTCCGGCGGAGAGTGAGCTGGAATTGGGATTAGGGCTCAG 

CCTCGGTGGTGGCGCGTGGAAAGAGCGTGGGAGGATTCTTACTGCTAAGGATTTTCCTTC 

CGTTGGGTCTAAACGCTCTGCTGAATCTTCCTCTC^CCAAGGAGCTTCTCCrCCTCGTTC 

AAGTCAAGTGGTAGGATGGCCACCAATTGGGTTACACAGGATGAACAGTTTGGTTAATAA 

CCAAGCTATGAAGGCAGCAAGAGCGGAAGAAGGAGACGGGGAGAAGAAAGTTGTGAAGAA 

TGATGAGCTCAAAGATGTGTCAATGAAGGTGAATCCGAAAGTTCAGGGCTTAGGGTTTGT 

TAAGGTGAATATGGATGGAGTTGGTATAGGCAGAAAAGTGGATATGAGAGCTCATTCGTC 

TTACGAAAACTTGGCTCAGACGCTTGAGGAAATGTTCTTTGGAATGACAGGTACTACTTG 

TCGAGAAAAGGTTAAACCTTTAAGGCTTTTAGATGGATCATCAGACTTTGTACTCACTTA 

TGAAGATAAGGAAGGGGATTGGATGCTTGTTGGAGATGTTCCATGGAGAATGTTTATCAA 

CTCGGTGAAAAGGCTTCGGATCATGGGAACCTCAGAAGCTAGTGGACTAGCTCCAAGACG 

TCAAGAGCAGAAGGATAGACAAAGAAACAACCCreTTTAGCTTCCCTTCCA^^ 

TTGTTTATGTATTGTTTGAGGTTTGCAATTTACTCGATACTTTTTGAAGAAAGTATTTTG 

GAGAATATGGATAAAAGCATGCAGAAGCTTAGATATGATTTGAATCCGGTTTTCGGATAT 

GGTTTTGCTTAGGTCATTCAATTCGTAGTTTTCCAGTTTGTTTCTTCTTTGGCTGTGTAC 

CAATTATCTATGTTCTGTGAGAGAAAGCTCTT 

>G464 Amino Acid Sequence (domain in AA coordinates: 20-28, 71-82, 126-142, 187-224)

MRGVSELEVGKSNLPAESELELGLGLSLGGGAWKERGRILTAKDFPSVGSKRSAESSSHQ
GASPPRSSQWGWPPIGLHRMNSLV^QAMKAARAE^ 
VQGLGFVKVNl^GVGIGRKOTMRAHSSYENI*AQTIiEEMFFGMTGTTCR^ 
SDFVIjTYEDKEGDWMI»VGDVPWRMF ins VKRLRIMGTSE ASGLAPRRQEQKDRQRNNPV * 
>G557 (192..698)

CAGAGATCTGACGGCGGTAGCAGAGTAATCTATTCCTTCCCAAAATGTCTCGCAATTAGA 
TTCTTTCCAAGTTCTTCTGTAAATCCCAAGTCCCGCTCTTTTCCTCTTTATCCTTTTCAC 
CAGCTTCGCTACTAAGACAACAAATCTTTCCCTCTCTCTCTCGCCTGATCGATCTTCAAA 
GAGTAAGAAAAATGCAGGAACAAGCGACTAGCTCTTTAGCTGGAAGCTCTTTACCATCAA 
GCAGCGAGAGGTCATCAAGCTCTGCTCCACATTTGGAGATCAAAGAAGGAATTGAAAGCG 
ATGAGGAGATACGGCGAGTGCCGGAGTTTGGAGGAGAAGCTGTCGGAAAAGAAACTTCCG 
GTAGAGAATCTGGATCGGCGACCGGTCAGGAGCGGACACAGGCGACTGTCGGAGAAAGTC 
AAAGGAAGCGAGGGAGGACACCGGCGGAGAAAGAGAACAAGCGGCTGAAGAGGTTGTTGA 
GGAACAGAGTTTCAGCTCAGCAAGCAAGAGAGAGGAAAAAGGCTTACTTGAGCGAGTTGG 
AAAACAGAGTGAAAGACTTGGAGAACAAAAACTCTGAACTTGAAGAGCGACTCTCTACTC
TTCAGAACGAGAACCAGATGCTTAGACATATTCTGAAGAACACAACAGGAAACAAGAGAG 
GAGGTGGTGGTGGTTCTAATGCTGATGCAAGCCTTTGATCTCCTTCTTCTTCTTGTGTTA 
TATTTTTGTGGATAAAATTTACAGAGAATTGTATCAATAATTATCATGTTAAAATTATAT 
GGGATGTGAGAGCTAATATTGCAATTGTAGACCAAGTTGTCTTAAAAAAAAAAAAAAAAA 
AA 

>G557 Amino Acid Sequence (domain in AA coordinates: 90-150) 

MQEQATSSLAASSLPSSSERSSSSAPHLEIKEGIESDEEIRRVPEFGGEAVGKETSGRES

GSATGQERTQATVGESQRKRGRTPAEKENKRLKRLLRNRVSAQQARERKKAYLSELENRV

KDLENKNSELEERLSTLQNENQMLRHILKNTTGNKRGGGGGSNADASL*

>G577 (44..2155)

AAAAACAGACTGAGAGAGAGAGAGAGAGAGTGTGTTGTTGGCCATGGGATGCACGGCCTC 
CAAGCTCGACAGTGAGGATGCTGTCCGTCGCTGCAAGGAGCGGCGCCGTCTTATGAAGGA 
CGCCGTCTACGCTCGTCACCATCTCGCCGCCGCTCACTCTGACTACTGCCGCTCCCTTCG 
TCTCACTGGCTCTGCCCTCTCCTCCTTCGCCGCCGGCGAGCCCCTCTCCGTCTCCGAGAA 
TACTCCCGCTGTTTTTCTCCGCCCTTCCTCCAGTCAGGACGCGCCACGTGTCCCTTCTTC 
CCATTCCCCAGAACCCCCTCCTCCGCCCATCCGCAGCAAGCCTAAGCCTACTAGGCCTAG
GAGGCTTCCACACATTCTCTCCGACTCCTCTCCTTCTTCCTCTCCTGCCACCAGTTTCTA 
TCCCACTGCTCACCAGAACTCTACTTACTCTCGCTCTCCATCTCAAGCTTCCTCTGTCTG
GAACTGGGAGAATTTCTACCCTCCCTCTCCCCCCGACTCCGAGTACTTCGAACGCAAAGC 
TCGCCAGAACCACAAGCACCGTCCTCCTTCCGACTACGACGCCGAAACTGAAAGATCCGA 
CCACGATTACTGCCACTCACGGAGAGATGCCGCCGAGGAAGTTCACTGCAGCGAGTGGGG 
CGACGACCACGACCGTTTCACTGCCACCTCTTCGTCCGACGGAGATGGGGAGGTCGAAAC 
TGACGTTTCCAGATCCGGTATTGAAGAAGAGCCTGTGAAA<^^ 

TGGCAAAGAGCACTCTGACCATGTTACCACTTCTTCCGACTGCTACAAGACCAAATTGGT 
GGTAAGGCACAAGAATTTGAAGGAGATCCTTGACGCCGTTCAAGACTACTTCGACAAGGC 
TGCCTCCGCTGGGGACCAGGTCTCCGCCATGCTTGAGATCGGCCGGGCTGAGCTCGACCG 
CAGCTTCAGCAAGCTGAGGA&GACGGTGTATCATT 

CGCAAGCTGGACCTCAAAACCCCCATTGGCAGTCAAATACAAGCTCGATGCATCTACCCT 
GAATGATGAACAAGGCGGCCTCAAGAGCCTCTGC^CCACTCTAGACCGACTCCTCGCTTG 
GGAAAAGAAGCTTTATGAGGATGTCAAGGCAAGAGAAGGAGTTAAGATTGAGCACGAGAA 
GAAGCTGTCTGCGCTGCAGAGTCAGGAGTATAAGGGAGGTGATGAATCCAAGCTAGACAA 
GAGTAAAACTTCCATAACCAGACTGGAATCACTCAT 

AACCACGTCTAATGCCATTCTCCGCCTCCGGGACACTGACCTTGTCCCTCAGCTTGTTGA 
ACTCTGCCACGGATTAATGTACATGTGGAAGTCAATGC^CGAGTATCACGAAATCCAGAA 
CAACATCGTGCAACAAGTCCGTGGCCTGATCAACCM 

AGAGGTACACCGGCAGGTGACGCGGGACCTAGAGTCAGCTGTGTCCTTGTGGCATTCGAG 
CTTCTGTCGCATCATTAAATTCGAGAGGGAGTTCATATGCTCTCTCCACGCATGGTTCAA 
GCTGAGCCTGGTTCCCCTGAGCAACGGAGACCCAAAGAAACAGCGGCCAGACTCATTTGC
CTTGTGCGAGGAGTGGAAGCAGAGCCTGGAACGGGTGCCTGACACAGTGGCGTCAGAAGC 
GATAAAGAGCTTTGTAAACGTGGTACATGTGATATCAATAAAGCAGGCGGAAGAGGTGAA 
GATGAAGAAACGCACGGAGAGTGCAGGAAAGGAGCTGGAGAAGAAAGCATCCTCACTGAG 
GAGCATAGAGAGGAAGTACTACCAGGCATACTCGACGGTTGGGATAGGCCCTGGACCGGA 
GGTGTTGGACTCACGGGACCCGCTATCTGAGAAGAAATGTGAGCTGGCGGCATGTCAGAG 
GCAGGTGGAGGATGAGGTAATGAGGCACGTGAAGGCTGTGGAGGTGACACGAGCTATGAC 
TCTCAACAATCTACAAACCGGCCTGCCCAATGTATTCCAGGCCTTGACCAGCTTCTCATC 
TCTCTTCACTGAATCTCTCCAGACTGTCTGTTCTCGTTCCTACTCCATCAACTGATTATG 
TCCAAGTTTCTCATTTATTTTTAAGCTCTC^ 

TGATTAAATTGAGTCTTGTGGTTTTGTGAGGACTGAGAATCTTTCTCATTTAAAAAAAAA 
AAAAAAAAAA 

>G577 Amino Acid Sequence (domain in AA coordinates: TBD) 
MGCTASKLDSEDAVRRCKERRRIdynro^ 

LSVSENTPAVFLRPSSSQDAPRVPSSHSPEPPPPPIRSKPKPTRPRRLPHILSDSSPSSS
PATSFYPTAHQNSTYSRSPSQASSVWNWENFYPPSPPDSEYFERKARQNHKHRPPSDYDA
ETERSDHDYCHSRRDAAEEVHCSEWGDDHDRFTATSSSDGDGEVETHVSRSGIEEEPVKQ 
PHQDPNGKEHSDHVTTSSDCYKTKLVVRHKN^^ 

RAELDRSFSKLRKTVYHSSSVFSNLSASWTSKPPLAVKYKLDASTI^ 
DRLLAWEKKLYEDVKAREGVKIEHEKKIjSALQSQEYKGGDESBGjDKTKTS ITRL.QSLIIV 
S S EAVIjTTSNAI LRLRDTDLVPQLVEIjC^GLM ymwksmhe yhe iqnnivqqvrgl inqte 
rgestsevhrqvtrdlesavslwhssfcriikfqreficslhawfklslvplsngdpkkq 

RPDSFALCEEWKQSLERVPDTVASEAIKSFVNVVHVISIKQAEEVKMKKRTESAGKELEK
KASSLRSIERKYYQAYSTVGIGPGPEVIjDSRDPLSEKKCELAACQRQV^ 
VTRAMTLNNLQTGLPNVFQALTSFSSLFTESLQTVCSRSYSIN*
>G674 (1..786)

ATGGTGTTTAAATCAGAATU^ATCAAACCGGGAAATGAAATCAAAGGAGAAGCAAAGGAAG 
GGATTATGGTC^CCCGAGGAAGATGAGAAGCTTAGGAGTCATGTCCTCAAATATGGCCAT 
GGATGCTGGAGTACTATTCCTCTTCAAGCTGGATTGCAGAGGAATGGGAAGAGTTGTAGA 
TTAAGGTGGGTTAATTATTTAAGACCTGGACTTAA 

GAAACTATACTTCTTTCACTTGATTCCATGTTGGGTAACAAATGGTCTCAGATAT 
TTCTTACCAGGAAGAACCGACAACGAGATCAAAAACTATTGGCATTCTAATCTAAAGAAG 

GGTGTAACTTTGAAACAACATGAAACCACAAAAAAACATCAAACACCTTT 

TCACTTGAGGCCTTGCAGAGTTCAACTGAAAGATCTTCTTCATCTATCAATGTCGGAGAA 

ACGTCTAATGCTCAAACCTCAAGCTTTTCGCCAAATCTCGTGTTCTCGGAATGGTTAGAT 
CATAGTTTGCTTATGGATCAGTCACCTCAAAAGTCTAGCTATGTTCAAAATCTTGTTTTA
CCGGAAGAGAGAGGATTCATTGGACCATGTGGCCCTCGTTATTTGGGAAACGACTCTTTG 

TTCTGTACTTCATTTTCAGACAACTTTTTGTTCGATGGTCTCATCAACGAGCTACGACCA 
ATGTAA 

>G674 Amino Acid Sequence (domain in AA coordinates: 20-120) 

MVFKSEKSNREMKSKEKQRKGLWSPEEDEKI^ 

IJttWimiRPGLKKSLFTKQE 

GVTLKQHETTKKHQTPIilTNSLEALQSSTERSSSSINVGETSNAQTSSPSPNLVFSEWLD 
HSIjIjMDQS PQKS S YVQNIiVLPEERGF IGPCGPRYIiGNDSIjPDFVPNSEFLLDDE IS SEIE 
FCTSFSDNFLFDGLINELRPM* 
>G736 (1..513) 

ATGGCGACTCAAGATTCTCAAGGGATTAAACTCTTTGGCAAAACTATTGCATTTAACACT 
CGAACAATAAAAAATGAAGAAGAGACACACCCGCCGGAGCAAGAAGCCACAATAGCCGTT 
AGATCATCATCATCATCGGATCTGACGGCCGAGAAGCGTCCGGATAAGATCATAGCATGT 
CGAAGATGCAAGAGCATGGAGACAAAGTTCTGTTACTTC^ 

CCTCGACACTTTTGTAAAGGCTGCCACCGTTACTGGACCGCCGGTGGTGCACTCCGGAAC 
GTTCCCGTCG<3CGCCGGTCGTCGGAAGTCCAAACCACCTGGTCGTGTCGTGGTTGGTATG 
CTTGGAGATGGAAATGGTGTTCGCCAAGTCGAGCTTATAAATGGCTTGCTCGTTGAGGAG 
TGGCAG<^TGCCGCAGCCGCAGCTC^CGGTAGTTTCCGGC^TGATTTTCCCATGAAGCGG 
CTCCGGTGTTACTCCGACGGTCAATCGTGCTGA 

>G736 Amino Acid Sequence (domain in AA coordinates: 54-111) 

MATQDSQGIKLFGKTIAFNTRTIKNEEETHPPEQEATIAVRSSSSSDLTAEKRPDKIIAC

PRCKSMETKFCYFNNYNGNQPRHFCKGCHRYWTAGGALRN^ 

LGDGNGVRQVELINGLLVEEWQHAAAAAHGSFRHDFPMKRLRCYSDGQSC*

>G903 (96..1496)

CCCGGGTCGACCCACGCGTCCGCTCTCTCTCTCTGAACTATACAAAAACCTACTTTTAAT

TTCTCTTCCAAGAAGTCAAGAACCC^GAAGAAGACATGACAAGTGAAG 

TCTCAAGTGGATC^GGTTTTGCTCAGCCACAGAGCTCATCAACCCTGGATCATO 

CTCTCATCAATCCTCCTCTTGTTAAGAAAAAGAGAAATCTCCCTGGAAATCCTGATCCGG 

AAGCTGAAGTGATAGCTTTATCCCCCACGACCTTGATGGCTACGAACCGGTTCCTATGTG 

AGGTATGTGGCAAAGGTTTCCAAAGAGACCAAAACTTACAGCTTCATCGGCGAGGACATA 

ATCTTCGATGGAAGTTGAAGCAGAGGACAAGCAAAGAAGTGAGAAAACGTGTCTACGTTT 

GCCCCGAGAAGACATGTGTCCACCATCACTCCTCTAGAGCTCTAGGCGATCTCACTGGAA 

TCAAAAAGCATTTTTGCCGGAAACACGGGGAGAAGAAGTGGACGTGCX3AGAAATGTGCTA 

AGAGATACGCAGTCCAATCTGATTGGAAAGCTCATTCCAAGACTTGTGGTACTAGAGAGT 

ACCGTTGCGATTGTGGCACCATTTTCTCAAGGCGAGACAGCTTTATCACT 

TCTGCGATGCCTTAGCGGAAGAAACCGCTAAGATAAACGCAGTGTCTCATCTCAACGGTT

TAGCCGCGGCnXBGAGCCCCAGGATCAGTTAATCTCAACTATCAATATC 

TCATCCGACCGCTTGAACCATTTGTAC 

AACATTTTCAGCCACCAACTTCTTCGTCGCTCTCTCTATGGATGGGACAAGATATCGCGC 

CGCCTCAACCGCAACCGGACTACGATTGGGTTTTTGGAAACGCTAAGGCT^GCGTCTGCT 

GCATTGATAATAATAATACTCACGATGAGCAGATTACGCAAAAC^ 

CCACTACCACTACTCTCTCTGCCCCTTCTTTATTCAGCAGCGACCAACCACAAAACGCAA 
ACGC^U^TTCAAACGTGAATATGTCCGCGACAGCTTTACTACAGAAAGCTGCTGAAATTG 
GCGCTACTTCTACAACAACCGCAGCGACCAATGACCCATCAACGTTTCTTCAAAGTTTCC 

TCGGGTCTAACAAGAACATTGGGTTAATGAGTCGTAGTCATGATCATCAAGAGATCGAGA 

ACGCTAGAAATGACGTTACGGTTGCGTCTGCCTTGGATGAATTACAGAATTACCCTTGGA 

AACGTAGAAGAGTTGATGGTGGAGGTGAAGTGGGTGGAGGAGGGCAAACTCGGGATTTCC 

TCGGGGTTGGTGTACAAACGTTGTGCCATCCATCGTCTATCAATGGATGGAlTTTGAAAGA 

GTTTAAAAATTTCGGGGTTAATGCATAAATTACGTAAAAGAAGAAGGAATCTTTTGTCAT 

TTCC^CCATTTTCTAAGATAACATATGTATATGGTAATGGAAGTTGTTTTCTTTTATTAA 

TTC^UITATTCTAAAACTTATGATATATGTATAATGAATGTGTTTATCTTCAAA 

>G903 Amino Acid Sequence (domain in AA coordinates: 68-92) 

MTSEVLQTISSGSGFAQPQSSSTLDHDESLINPPLVKKKRNLPGNPDPEAEVIALSPTTL

MATNRFL CEVCGKGFQRDQNLQLHRRGHNIjPWKLKQRTS KEVRKRVYVCPEKTCVHHHS S 
RALGDLTGIKKHFCRKHGEKKHTCEKCAKRYAVQSDWKAHSKTCGTREYRCDCGTIFSRR

DSFITHRAFCDAI1AEBTAKINAVSHLNGI1AAAGAPGSVNI1NYQ I PPIjQPFVPQP 

QTNPNHHHQHFQPPTSSSLSLWMGQDIAPPQPQPDYDWVFGNAKAASACIDNNNTHDEQ 

TQNAMASIiTTTTTLSAPSIjFSSDQPQNANANSNVNMSATALIjQKAAEIGA 

PSTFIiQSFPLKSTDQTTSYDSGEKFFALFGSNNNIGL 

DEIiQNYPWKRRRVDGGGEVGGGGQTRDFLGVGVQTIiCHPSSINGWI* 

>G917 (32..679)

TTAGGGTTTTAGAAAGATAGATCGATTGAAGATGAGGAAAGGTAAGAGAGTGATAAAAAA 
GATAGAGGAGAAAATAAAGAGACAAGTGACATTCGCAAAGAGAAAGAAGAGTCTAATCAA 
GAAGGCATATGAACTCTCTGTTCTCTGCGATGTCCACCTTGGTCTCATCATCTTCTCTCA 
CTCC^CAGGCTCTACGATTTCTGCTCCAACTCTACCAGCATGGAGAATCTCATCATGAG 
ATACCAAAAGGAAAAAGAAGGTCAAACCACTGCAGAACACAGTTTCCACTCGGATCAGTG 
TTCAGATTGCGTGAAGACGAAGGAATCAATGATGAGAGAGATAGAGAATCTTAAGCTGAA 
TCTTCAATTGTACGACGGACATGGCTTGAATCTCTTGACCTACGACGAGCTCCTTTCTTT 
TGAGOTCCATCTCGAATCTTCTCTACAACATGCTCGAGCTCGCAAGTCTGAGTTCATGC^ 
TCAGCAGCAGCAGCAACAAACAGATCAAAAGCTTAAGGGAAAAG 

CTCTTGGGAGCAGCTGATGTGGCAAGCAGAGAGACAGATGATGACGTGTCAAAGACAAAA 
AGATCCTGCGCCGGCX3AATGAAGGAGGAGTTCCTTTTTTACGGTGGGGAACAACCCACCG 
ACGTTCTTCACCTCCTTAAGCTACCACAACCA^ 

CTATCTATAAAAAACAACTGATAGTAAAAAGTATTGACCCGGTTTGGTTCGGTTATGTTG 
ATACCAGACTAraAATTAACTTCGGTTAGACGTATTTACGACTTGATGCTATCTAGACCT 

TTTTGCCCTTCAAAAAAA 

>G917 Amino Acid Sequence (conserved domain in AA coordinates: 
MRKGKRVIKKIEEKIKRQVTFAKRKKSIjI 

STSMENIiIMRYQKEKEGQTTAEHSFHSDQCSDCVKTKESMMREIENIiKIj^^ 

i^tydellsfelhlesslqhar^ksefmhqqqqqqtdqklkgkekgqgssweqlmwqae 
rqi^cqrqkdpapaneggvpflrwgtthrrsspp* 

>G921 (116..1024)
CC^^GATCGACTCTTACTTCGAATCT^ 

CACAC^TATACATCCACAAGAACCCATATCGAAGATTCATCCTACATATATTT^ 
TCAGTACTCATCCTCTTTGGTCGATACTTCATTAGATCTCACTATTGGCGTTACTCGTAT 
GCGAGTTGAAGAAGATCCACCGACAAGTGCTTTGGTGGAAGAATTAAACCGAGTTAGTGC 
TGAGAACAAGAAGCTCTCGGAGATGCTAACTTTGATGTGTGACAACTACAACGTCTTGAG 
GAAGCAACTTATGGAATATGTTAACAAGAGCAACATAACCGAGAGGGATCAAATCAGCCC
TCCCAAGAAACGCAAATCCCCGGCGAGAGAGGACGCATTCAGCTGCGCGGTTATTGGCGG 
AGTGTCGGAGAGTAGCTCAACGGATCAAGATGAGTATTTGTGTAAGAAGCAGAGAGAAGA 
GACTGTCGTGAAGGAGAAAGTCTCAAGGGTGTATTACAAGACCGAAGCTTCTGACACTAC 
CCTCGTTGTGAAAGATGGGTATCAATGGAGGAAATATGGACAGAAAGTGACTAGAGACAA 
TCCATCTCCAAGAGCTTACTTC^AATGTGCTTGTGCTCCAAGCTGTTCTGTCAAAAAGAA 
GGTTCAGAGAAGTGTGGAGGATCAGTCCGTGTTAGTTGCAACTTATGAGGGTGAACACAA 
CCATCGAATGCCATCGCAGATCGATTCAAAC^ 

TGGTTCAGCTTCAACACCCGTTGCAGGAAAGAGAAGAAGTAGCTTGACTGTGCCGGTGAC 

TACCGTAGATATGATTGAATCGAAGAAAGTGACGAGCCCAACGTCAAGAATCGATTTTCC 

CCAAGTTCAGAAACTTTTGGTGGAGCAAATGGCTTCTTCCTTAACCAAAGATCCTAACTT 

TACAGCAGCTTTAGCAGCAGCTGTTACCGGAAAATTGTATCAACAGAATCATACCGAGAA

ATAGTTTAGCTTCAAATTCCGTTAGAGTTTTTAGATTTGAATTTGTCATGAGTAAGAGAA 

AGAGAGTAGATTATAATCCNTTGTGATACTGAAAAAAAAAAAAAAAAAAA 

>G921 Amino Acid Sequence (domain in AA coordinates: 146-203)

MDQYSSSIiVDTSLDLTIGVTRMRVEEDPPTSALV^ 

LRKQLMEYVNKSNITERDQISPPKKRKSPAREDAFSCAVIGGVSESSSTDQDEYLCKKQR 
EETVVKEKVSRVYYKTEASDTTLWK^ 

KKVQRSVCTQSVXVATYEGEHiraPMPSQIDSNNGLNRHISHGGSASTPVAAN^ 

VTTTOMIESKK^/TSPTSRIDFPQVQKLIiVEQMASS 

EK* 

>G922 (1..1449) 

ATGGTGGCTATGTTTCAAGAAGATAATGGAACATCTTCTGTAGCTTCATCACCACTTCAA
GTCTTCTCAACTATGTCACTCAACAGACCGACTCTCCTCGCTTCTTGATCTCCGTTTCAT 
TGTCTCAAAGATCTCAAACCAGAGGAGCGTGGTCTCTACTTAATCCACCTCTTGCTAACT
TGTGCCAACCACGTGGCTTCAGGTAGCCTCCAAAACGCTAACGCAGCGCTCGAGCAGCTC 
TCTCACCTCGCTTCTCCTGACGGCGACACGATGCAGCGAATCGCTGCTTACTTCACCGAA 
GCGCTTGCTAACAGAATCCTTAAGTCCTGGCCTGGTCTTTACAAGGCTCTTAACGCAACT 
CAGACAAGAACTAACAATGTCTCTGAGGAGATTCATGTTAGAAGACTCTTCTTTGAGATG
TTCCCGATACTCAAAGTCTCTTACTTGCTCACTAATCGAGCTATACTCGAGGCTATGGAA
GGAGAGAAGATGGTTCATGTGATTGATCTCGATGCTTCTGAGCCAGCTCAATGGCTTGCT 
TTGCTTGAAGCTTTTAACTCTAGGCCTGAAGGTCCACCT 

CATCACCAGAAGGAAGTGCTTGAACAAATGGCTCATAGACTC^TTGAGGAAGCAGAGA^ 
CTCGATATCCCGTTTCAGTTTAATCCCGTTGTGAGTAGGTTAGACTGTTTAAATGTAGAA 
GAGTTGCGGGTTAAAACAGGAGAGGCCTTAGCCGTTAGCTCGGTTCTTCAATTGCATACC 
TTCTTGGCCTCTGATGATGATCTGATGAGAAAGAACTGCGCTTTACGGTTTCAGAACAAC 
CCTAGTGGAGTTGACTTGCAGAGAGTTCTAATGATGAGCCATGGCTCTGCAGCTGAGGCA 
CGTGAGAATGATATGAGTAACAACAATGGGTATAGCCCTAGCGGTGACTCGGCCTCATCT 
TTGCCTTTACCAAGTTCAGGAAGGACTGATAGCTTCCTCAATGCTATTTGGGGTTTGTCT 
CCAAAGGTCATGGTGGTCACTGAGCAAGACTCAGACCAG^ 

AGGCTATTAGAATCACTTTACACCTACGCAGCATTGTTTGATTGCTTGGAAACAAAAGTT 
CCAAGAACGTCTCAAGATAGGATCAAAGTGGAGAAGATGCTCTTCGGGGAGGAGATCAAG 
AACATCATATCCTGCGAGGGATTTGAGAGAAGAGAAAGACACGAGAAGCTTGAGAAATGG 
AGCCAGAGGATCGATTTGGCTGGTTTTGGGAATGTTCCTCTTAGCTATTATGCGATGTTG
CAGGCTAGGAGATTGCTTCAAGGGTGCGGTTTTGATGGGTATAGAATCAAGGAAGAGAGC 
GGGTGCGCAGTAATTTGCTGGCAAGATCGACCTCTATACTCGGTATCAGCTTGGAGATGC 
AGGAAGTGA 

>G922 Amino Acid Sequence (conserved domain in AA coordinates: 225-242)

MVAMFQEDNGTS S VAS S PLQVFSTMSliNRPTLLAS S S PFHCLKDIiKPEERGLYL IHLLLT 

CANHVASGSLQNANAALEQLSHLASPDGDTMQRIAAYFTEAIiANRILKSWPGLYKA 

QTRTNITVSEEIHVRRLFFEMFPILKVSYTjLTNRAILEAM^ 

IiLQAFNSRPEGPPHIiRITGVHHQKEVl.EQMAHRIiIEEAEKI^I 

QLRVlCTGEAIiAVSSVIiQIiHTFLASDDDLMRKNCAIjR^ 

RENDMSNITNGYSPSGDSASSLPLPSSGRTDSFIjNAIWGLSPKVMVVTEQDSDHNGSTIjME 
RLLESLYTYAAIiFDCIiETKVPRTSQDRIKVEKl^FGEEIKNIISCEGFERRERHEKLEKW 
S QRIDLAGFGNVPLS YYAMIiQARRIiLQGCGFDGYRI KEE SGCAVI CWQDRPLYS VS AWRC 
RK* 

>G932 (206..1213)

CCACGCGTCCGACCACTTGTACCTCTTTGTCTTAAGTACTCTTTAACCCTACAATTTCCT 
AAGCTCTC^AGCC^Cy^UUVACCAC^^ 

ATCAAAGTCCTTCTCTCTGCT(^TAC(^CAAACCGTTCCATTCTTCCCCTAATCACAAAG 
TGATATTTACATAGAGAAGATAGAGATGGGAAGACCACGATGCTGTGACAAGATTGGAGT 
GAAGAAAGGACCATGGACACCAGAGGAAGATATCATCTTGGTTTCTTACATCCAAGAACA 
TGGTCCTGGAAACTGGAGATCTGTGCCTACTCACACAGGTTTGAGGAGATGTAGCAAAAG 
CTGTAGATTGAGGTGGACTAATTATCITCGACCn^ 

GC^TGAAGAGAAGATGATTCTCCATCTTCAAGCTCTTTTGGGAAACAGGTGGG 

AGCATCATATCTTCCAGAAAGGAGAGACAATGATATAAAGAACTATTGGAACACTC^ 

GAAGAAAAAGCTCAAGAAGATGAATGATTCTTGTGATAGTACTATCAACAATGGCCTTGA 

TAATAAAGACTTCTCCATATCAAACAAAAACACTAC^ 

TAAAGGTCAATGGGAGAGAAGGCTTCAGACAGATATCAA 

TGATGCCTTGTCTATTGACAAACCACAAAACCCAACTAATTTTTCTATTCCCGATCTTGG 
TTATGGTCCATCAO^TTCTTCGTCCTCTACC^CCACCACC^CCACCACCACCACCACGAG 
AAACACTAATCCATACCC^TCTGGGGTCTATGCITCAAGTGCTGAGAACATTGCTCGTTT 
GCTTCAGAATTTTATGAAAGACACACCAAAGACCTCGGTGCCCTTGCCGGTTGCAGCCAC 
CGAGATGGCTATCACCACGGCAGCTTCGAGCCCTAGCACAACCGAAGGAGACGGAGAAGG 
GATTGACCATTCTTTGTTGAGCTTGAACTCG^^ 

AATAGACCATGACATTAATGGTCTAATTACACAAGGCTCTCTTTCTTTGTTCGAGAAATG 

GCTCTTTGATGAGCAAAGCCACGATATGATCATCAATAACATGTCACTAGAGGGTCAGGA 

AGTGTTGTTCTAGAAAGCATTAAAGTTTGACGATTTGCTTGAGGAACCACGAGGCTTA 

TATAAACAATTTGTATAATTAAGTACTCTTTAGTTTTGTTTTCAATCCTTATTATGATCA 

TATTGCAGTAATTAGGGATTTTAGTCTTTAGTAGTAACTCTTAAGTTTTAACAC^TTTTT 




CTCTATCTTTTTAGTAGTAACTCTTTATTTTTTCCTTAAATCTTTGTCGACGTGGAGATG 
ATATCTTCTATGTAGTAGAAACTCAAAAGTGTAC^TCATCTTTATTAATGTAACGTCTTT 
TTAAAAAAAAAAAAAAAAA 

>G932 Amino Acid Sequence (domain in AA coordinates: 12-118) 
MGRPPCC^KIGVKKGPWTPEEDIILVSYIQEHGPGNWRSVPTHTGL^ 
LRPGIKRGNFTEHEEKMILHIjQALLGNRWAAIASYLPERTDI^IKN^ 
DSCDSTINNGLDNKDFSISNKNTTSHQSSNSSKGQWERIUjQTDINM 

QNPTNFSIPDLGYGPSSSSSSTTTTTTTTTTRNTNPYPSGVYASSAENIARIjLQNFMKDT 
PKTSVPLPVAATEMAITTAASSPSTTEGDGEGIDHSLFSFNSIDEAEEKPKLiIDHDINGL 
ITQGSLSLFEKWLFDEQSHDMI INNMSLEGQEVLF* 
>G599 (152..1579)

TCGACAGAAC^GCITCGTTGTCACTTGTCATTCTATAAATCGCATCCCCATTGACAACCT 
TTC^CTTCCATCAAAACTCTCTCTCTATATCTCTCTCTCTATATATCTCTCTCTATATCT 
CTCTCTCTCTTCACTCTCTCITTCTTT 

ACCCGACCCGGTTTACCGTCCACCGGAAAC^CCACTCGAACCGATGGAGTTTTTAGCTCG 

TTCATGGAGCGTCTCTGCTCTCGAAGTCTCCAAGGCTCTAACACGACCCAACCCTCAGAT 

TCTCCTCTCCAAAACCGAAGAAGAAGAAGAAGAAGAACCCATCTCCTCTGTCGTAGACGG 

CGACGGCGACACGGAAGACAC(XK3AC^TGTCACCGGAAACC 

AGAAACTTCTCAAATGGTCATGGATCGTATCTTGTCrC^ 

AACATCTGGTCGGCTATCTCACAGTAGTGGTCCACTTAATGGTTCTTTGACCGACAGTCC 

TCCTGTGTCTCCTCCCGAATCCGACGACATTAAGCAATTTTGCAGAGCGAACAAAAATTC 

ATTGAACAGTGTAAATTCTCAGTTCCGTTCAACGGCGGCAACTCCGGGACC 

TACAGCTACACAGTCCAAGACGGTGGGACGGTGGCTTAAGGACCGGAGAGAGAAAAAGAA 

AGAGGAGACTCGGGCTCATAACGCTCAGATTCACGCTGCTGTCTCTGTCGCCGGCGTTGC 

TGCAGCTGTTGCTGCTATTGCAGCAGCCACCGCTGCGTCTTCTAGCTGTGGTAAGGATGA 

GC^GATGGCTAAAACTGACATGGCCXSTTGCTTCTGCTGCGACCCTTGTGGCTGCTCAGTG 

TGTGGAAGCTGCTGAAGOTATGGGAGCTGAGAGAGAGTAT^ 

CGCCGTCAATGTTCGTTCTGCCGGAGATATTATGACTCTCACCGCCGGAGCAGCTACAGC 

TTTAAGAGGAGTGCAAACATTGAAGGCAAGGGCAATGAAGGAAGTGTGGAACATAGCATC 

AGTGATACCAATGGATAAAGGACTCACTTCTACAGGAGGAAGCAGCAATAATGTTAATGG 

TAGCAATGGAAGCTCAAGCAGTAGTCACAGTGGTGAACTTGTACAACAGGAGAATTTCTT 

GGGAACTTGTAGTAGAGAATGGCTCGCTAGAGGTTGTGAACTCCTCAAACGCACTCGCAA 

AGGTGATCTCCACTGGAAGATAGTATCTGTTTACATCAACAAAATGAATCAGGTTATGTT 

GAAGATGAAGAGCAGGCATGTTGGAGGAACCTTCACCAAGAAGAAAAAGAACATTGTGCT

TGATGTGATCAAGAATGTCCCGGCCTGGCCTGGACGACATTTGCTAGAGGGAGGAGATGA 

TCTAAGATACTTCGGTTTGAAGACGGTTATGCGAGGTGATGTTGAATTCGAGGTCAAGAG 

CC^AAGGGAATATGAAATGTGGACACAAGGTGTCTCAAGGCTTCTTGTTCTTGCTGCTGA 

GAGGAAGTTTAGGATGTGAATAAACGTTCAATGGCTGCTTGGTTTAAGTGTGAGTTTTTT 

TTTAACTTATGTGGTCAAATTTCATTAGTAGGGGTTCTTTTAAGGTAATGGTTTTTTGGG 

TTGGGTATAGGATAAAATGGACCTACCAGTCAAGGTGAGGAAGCATTTGGGTAAACAAAA 

CTTAGTGGGGGTGATCTGTAATATCTATGTTCTTAGTTTTTTTTTGGTTGTTGGTGGTCT 

TTTTGTATAAAAAAACAAAGTTGAAGTAATAGATATATAGTATGTTTTAATTTTAAA

>G599 Amino Acid Sequence (domain in AA coordinates: 187-219, 264-300) 

MEKLMVPTWRPDPVYRPPETPLEPMEFLARSWSVSALEVSKALTPPNPQILLSKTEEEEE

EEPISSVVDGDGDTEDTGLVTGNPFSFACSETSQMVMDRILSHSQEVSPRTSGRLSHSSG

PLNGSLTDSPPVSPPESDDIKQFCRANKNSLNSVNSQFRSTAATPGPITATATQSKTVGR

WLKDRREKKKEETRAHNAQIHAAVSVAGVAAAVAAIAAATAASSSCGKDEQMAKTDMAVA

SAATLVAAQCVEAAEVMGAERE YLAS WS SAVNVRS AGD IMTLTAGAATALRGVQTLKAR 

AMKEV^IASVIPMDKGLTSTGGSSNl^ 

GCELLKRTRKGDLHWKIVSVYINKMNQVMLKMKSRHVGGTFTKKKKNIVLDVIKNVPAWP
GRHLLEGGDDLRYFGLKTVMRGDVEFEVKSQREYEMWTQGVSRLLVLAAERKFRM* 
>G804 (114..1139)

ATACTCCAAGAATTTATAGGTTATAAGTAAAAATTCAGTACAAGTTTGTTTGTTTGTTTA 
TTCCATTTTCTTGTGTGTTTTTTTCCCCATAATTTATAAATTTTATAAGCAATATGGAGT
CCCACAAGAACAACCAGAGCAACAACAACACCACTC 
TGGGACCAATCTCCGGTTCAGTCTCATTAACCACC^ 

CCGTCACCGCCGCTAAAACACCCGCAAAACGACCGTCCAAGGACCGTCACATCAAAGTAG 
ACGGACGTGGCCGGAGGATACGTATGCCGGCTATCTGCGCAGCACGTGTCTTCCAACTAA

CACGTGAGTTACAACACAAATCGGACGGCGAGACTATAGAGTGGCTGCTCCAACAAGCGG 

AGCCAGCTATCATCGCAGCCACCGGAACTGGAACCATACCGGCGAATATCTCTACTTTGA 

ACATCTCTCTTCGAAGCAGTGGCTCTACTCTTTCAGCTCCACTGTCTAAATCTTTCCACA 

TGGGAAGAGCGGCTCAAAACGCTGCCGTTTTTGGGTTCCAGCAA 

ATGATATCACGACAGATTCTTCTTCTTCTTCTCTTCCC^^ 

TTTTTAAAGATCCTAATTTTCTAGATCAAGAACCCGGTTCAAGATCACCTAAACCGGGAT 
CCGAAGCTCCTGATCAAGATCCGGGTTCGACCCGGTCAAGAACACAAAATATGATACCGC 
CGATGTGGGCACTAGCGCCAACGCCAGCCTCCACAAACGGAGGTAGTGCTTTTTGGATGT 
TACCAGTCGGAGGAGGAGGAGGTCCGGCTAACGTTCAGGATCCATCACAGCACATGTGGG 
CGTTTAATCCGGGTCATTACCCGGGTCGAATCGGGTCGGTTCAGCTAGGGTCTATGTTAG 
TGGGAGGTCAACAGTTAGGGTTAGGTGTTGCAGAAAATAACAATTTGGGGCTATTTTCCG 
GCGGAGGAGGAGACGGTGGTCGGGTTGGTCTCGGT^TGAGTCTTGAGCAAAAGCCTCAAC 
ATCAAGTGAGTGATCATGCTACTAGAGACCAAAATCCTACTATAGATGGTTCTCCTTGAA 
AGACTTCATGATTTCTTTGGTTTTTAAAAAGTGTGAATGTGTGATTTATTGCAACTTTTG 
TTGAGGACTCCAATGTTAATATGGGTTTTAGGGTTGGCTTTTCGGGATTGCCAAATTGTT 

ATT 

>G804 Amino Acid Sequence (domain in AA coordinates: 54-117) 
MESHNNNQSNNNTTGSAHLVPSMGPISG 

KVDGRGRRIRMPAICAARVFQLTRELQHKSDGETIEWLLQQAEPAIIAATGTGTIPANIS

TIiNISLRSSGSTLSAPLSKSFHMGRAAQNAAVFGFQQQIjYHPHHITTDSSSSSLPKT^ 

EDLFKDPNFLDQEPGSRSPKPGSEAPDQDPGSTRSRTQNMIPPMWALAPTPASTNGGSAF

WMLPVGGGGGPANVQDPSQHMWAFOT^ 

FSGGGGDGGRVGLGMSLEQKPQHQVSDHATRDQNPTIDGSP* 
>G1062 (297..1781)

CAAAAAAAAAGTTTCAATTTTTGAAAGCTCTGAGAAATGAAATCTATCATTCTCTCTCTC 

TATCTCTATCTTCCTTTTCAGATTTCGCTTCTTCAATTCATGAAA^ 

TTTAATGCTTCTCTTTTTTTAC^ 

GTTTTCAAACTTTTGCAGAAOT 

TTGCAGAATTTTCCTCTAAAGGTTCAGACTTTGGGGTAAAGGTGTCAACTTTGGCGATGG 
GTCTTGACGGAAAGAATGGTGGAGGGGTTTGGTTAAACGGTGGTGGTGGAGAAAGGGAAG 
AGAACGAGGAAGGTTCATGGGGAAGGAATCAAGAAGATGGTTCTTCTCAGTTTAAGCCTA 
TGCTTGAAGGTGATTGGTTTAGTAGTAACCAACCAC^TCCACAAGATCTTCAGATGTTAC 
AGAATCAGCCAGATTTGAGATACTTTGGTGGTTTTCCTTTTAACCCTAATGATAATCTTC
TTCTTCAACACTCTATTGATTCTTCTTCTTCTTGTTCTCCT^ 

ACCCTTCTGAGCAAAATCAGTTCTTGTCAACTAAC^CAACAAGGGTTGTCTTCTCAAT 
TTCeETCTTCTGO^AACCCTTTTGATAA^ 

TTAACGAAATCCATGCTCCTATTTCGATGGGGTTTGGTTCTTTGAC^ 

GGGATTTGAGTTCTGTTCCTGATTTCTTGTCTGCTCGGTCACTTCTTGCGCCGGAAAGCA 
ACAACAACAACACAATGTTGTGTGGTGGTTTCACAGCTCCGTT^ 

GTAGTCCTGCTAATGGTGGTTTTGTTGGGAACAGAGCGAAAGTTCTGAAGCCTTTAGAGG 

TGTTAGCATCGTCTGGTGCACAGCOTACTCTGTTCCAGAAACGTGCAGCTATGCGTCAGA 

GCTCTGGAAGCAAAATGGGAAATTCGGAGAGTTCGGGAATGAGGAGGTTTAGTGATGATG 

GAGATATGGATGAGACTGGGATTGAGGTTTCTGGGTTGAACTATGAGTCTGATGAGATAA 

ATGAGAGCGGTAAAGCGGCTGAGAGTGTTCAGATTGGAGGAGGAGGAAAGGGTAAGAAGA 

AAGGTATGCCTGCTAAGAATCTGATGGCTGAGAGGAGAAGGAGGAAGAAGCTTAATGATA 

GGCTTTATATGCTTAGATCAGTTGTCCCCAAGATCAGCAAAATGGATAGAGCATCAATAC 

TTGGAGATGCAATTOATTATCTGAAGGAACTTCTACAAAGGATCAATGATCT^ 

AACTTGAGTCAACTCCTCCTGGATCTTTGCCTCCAACTT 

GACCTACACCGCAAAGTCTTTCTTGTCGTGTCAAGGAAGAGTTGTGTCCCTCTTCTTTAC 
CAAGTCCTAAAGGCCAGCAAGCTAGAGTTGAGGTTAGATTAAGGGAAGGAAGAGCAGTGA 
ACATTCATATGTTCTGTGGTCGTAGACCGGGTCTGTTGCTCGCTACCATGAAAGCTTTGG 
ATAATCTTGGATTGGATGTTCAGCAAGCTGTGATCAGCTGTTTTAATGGGTTTGCCTTGG 
ATGTTTTCCGCGCTGAGCAATGCCAAGAAGGACAAGAGATACTGCCTGATCAAATCAAAG
CAGTGCTTTTCGATACAGCAGGGTATGCTGGTATGATCTGATCTGATCCTGACTTCGAGT 
C CATTAAGCATCTGTTGAAG (^GAGCTAGAAGAACTAAGTCCCTTTAAATCTGCAATTTT 
CTTCTCAACTTTTTTTCTTATGTCATAACTTCAATCTAAGC^ 
GAX3AGTTGTTTTTAAATTAAGC

TTCAACCTTTTATTAGCAATGTTAACTTCCATTTATGTTTCATCTT 

>G1062 Amino Acid Sequence (domain in AA coordinates: 308-359) 

MGLDGKNGGGVWLNGGGGEREENEEGSWGRNQEDGSSQFKPMLEGDWFSSNQPHPQDLQM

LQNQPDFRYFGGFPFNPNDNLLIiQHSIDSSSSCSPSQAFSLDPSQQNQFIjSTNNNKGCLL 

NVPSSANPFDNAFEFGSESGFLNQIHAPISMGFGSLTQLGNI^LSSVPDFLSARSLLAPE 

SNNNNTMLCGGFTAPLELEGFGSPANGGFVGNRAKVLKPLEVLASSGAQPTLFQKRAAMR

QSSGSKMGNSESSGMRRFSDDGDMDETGIEVSGLNYESDEINESGKAAESVQIGGGGKGK 

KKGMPAKNLMAERRRRKKLITORIjYMI^ SKMDRAS ILGDAIDYLKELLQRINDLH 

NELESTPPGSLPPTSSSFHPLTPTPQTLSCRVKEELCPSSLPSPKGQQARVEVRIiREGRA 

WIHMFCGRRPGLLLATMKAIJDNLGIiDVQQAVISCFNGFAIiDVFRAEQCQEGQ 

KAVLFDTAGYAGMI*

>G1322 (213..833)

AAAGTTATTGATAGTTTCTGTTACTTATTAATTTTTAAGGTTATGTGTATTATTACCAAT 
TGGAGGACTATATAGTCGCAAGTCTCAACCCTATAAAAGAAAACATTCGTCGATCATCTT 
CCCGCCTCGAGTATCTCTCTCTCTCTCTCTCTTCTCTGTTTTCTTTATTGATTGCATAGA 
CAAAAATACACACATAGACAACAGAAAGAAAGATGG 

GAGTGAAAGCGACAATAACGTCACAGAAAGAAGAAGAAGGAACAGTGAGAAAAGGACCTT 

GGACTATGGAAGAAGATTTCATCCTCTTTAATTACATCCTTAATCATGGTGAAGGTCTTT

GGAACTCTGTCGCCAAAGCCTCTGGTCTAAAACGTACTGGAAAAAGTTGTCGGCTCCGGT 

GGCTGAACTATCTCCGACCAGATGTGCGGCGAGGGAACATAACCGAAGAAGAACAGCTTT 

TGATCATTCAGCTTCATGCTAAGCTTGGAAACAGGTGGTCGAAGATTGCGAAGCATCTTC 

CGGGAAGAACGGACAACGAGATAAAGAACTTCTGGAGGACAAAGATTCAGAGACACATGA 

AAGTGTCATCGGAAAATATGATGAATCATCAACATCATTGTTCGGGAAACTCACAGAGCT 

CGGGGATGACGACGC^GGC^GCTCCGGCAAAGC<^TAGACACGGCTGAGAGCrTCTCTC 

AGGCGAAGACGACGACGTTTAATGTGGTGGAACAACAGTCAAACGAGAATTACTGGAACG 

TTGAAGATCTGTGGCCCGTCCACTTGCTTAATGGTGACCACCATGTGATTTAAGATATAT 

ATATAGACCTCCTATACATTTATATGCCCCAGCTGGGTTTTTTTGTATGGTACGTTATTT 

GGTTTTTCTATTGCTGAAATGTCGTTGCATTTAATTTACATAC 

ATTAAATCTTCAATACATATGGAGGTGGTGTTTGAGTAAAAAAAAAAAAAA 

>G1322 Amino Acid Sequence (domain in AA coordinates: 26-130)

METTMKKKGRVKATITSQKEEEGTVRKGPWTMEEDFILFNYILiraGE 

RTGKSCRLRWLNYLRPDVRRGNITEEEQLLIIQLHAKLGNRWSKIAKHLPGRTDNEIKNF
WRTKIQRHMKVSSENMMNHQHHCSGNSQSSGMTTQGSSGKAIDTAESFSQAKOT 
QQSNENYWNVEDLWPVHLLNGDHHVI*
>G1331 (1..786)

ATGGTGGAAGAAGTTTGGAGAAAGGGTCCATGGACCGCCGAAGAAGACCGTCTTTTGATC 
GAATACGTCCGTGTTCACGGTGAAGGTCGTTGGAACTCTGTCTCTAAACTCGCAGGATTG 
AAAAGGAATGGCAAAAGCTGCAGACTAAGATGGGTGAATTACCTTAGACCAGACCTCAAG 
AGAGGACAGATCACTCCACATGAAGAAAGTATAATACTTGAGCTACACGCTAAGTGGGGA 
AATAGGTGGTCAACAATTGCACGTAGTTTACCAGGAAGAACAGACAATGAGATCAAGAAC 
TATTGGAGAACCCATTTCAAGAAAAAGGCAAAGCCTACGACTAACAATGCGGAGAAGATA 
AAGAGTCGTCTCCTAAAAAGGCAACACTTCAAGGAACAGAGAGAAATAGAGTTGCAACAA 
GAACAGCAGTTGTTTCAGTTCGACCAACTCGGTATGAAAAAGATCATCTCTTTACTCGAA 
GAAAACftATAGCAGTAGCAGTAGCGATGGCGGTGGTGATGTGTTCTATTATCCTGATCAA 
ATAACACATTCATCAAAACCCTTTGGCTAT^ 

GGTAGATTTTCTCCTGTAAACATACCTGATGCTAATACTATGAACGAAGACAATGCCATA 
TGGGACGGGTTTTGGAACATGGATGTTGTAAATGGACATGGTGGGAACTTGGGTGTTGTG 
GCTGCTACTGCTGCTTGTGGCCCAAGGAAGCCCTATTTCCATAACTTGGTGATTCCATTT 
TGTTAA 

>G1331 Amino Acid Sequence (conserved domain in AA coordinates: 8-109)

MvlSETVWRKGPWTAEEDRLLIEYVRVHGEGRWNSV 

RGQITPHEESIILELHAKWGiraWSTIARSLPGRTDNEIKirYWRTH^ 

KSRLLKRQHFKEQREIELQQEQQLFQFDQLGMKKIISLLEENWSSSSSDGGGDVFYYPDQ
ITHSSKPFGYNSNSLEEQLQGRFSPVNIPDANTMNEDNAIWDGFW^ 
AATAACGPRKPYFHNLVIPFC*
>G1521 (1..891)
ATGCCTCCATTACCGTCCTCCACGGCGCCTTCGTCTTCGAGACATCTTCGATCGCCGGAA
AGTATCGCGAAATTTGCAGGGAGAGCAATATTTCCTGCTTTACAGGGGAAATCGTGTCCG 
ATATGCCTCGAAAATCTAACCGAGCGAAGATCCGCCGCCGTGATCACGGTGTGCAAGCAC 
GGATACTGCCTTGCTTGTATTCGGAAGTGGAGCAGCTTCAAGAGGAATTGTCCTCTTTGT 
AACACTCGTTTTGATTCCTGGTTTATCGTTAGTGATTTTGCTTCTAGAAAATACCATAAG 
GAGC^TTAC<^TTCTTCGTGATCGTGAGACTTTAACTTATCATCGGAATAATCCTTCC 
GATCGCCGGAGGATAATTCAAAGGTCGAGGGATGTTTTGGAAAACTCTAGCTCAAGATCA 
AGGCCATTGCCATGGCGGAGATCATTTGGACGACCAGGTTC^GTTCCTGATTCTGTTATC 
TTCCAGCGAAAGCTTCAGTGGCGAGCTAGCATATACACTAAGCAATTACGAGCTGTTCGA 
TTACATTCAAGGCGCTTGGAACTAAGTTTGGCGGTGAATGA 

ACTGAAAGAATTGAGCCATGGATTAGAAGAGAGCTTCAGGCAGTCCTTGGAGATCCTGAT 
CCCTCAGTTATTOTTCATTTTGCGTCAGCT 

AATCGACAAACCGGGCAGACCGGGATGTTGGTGGAAGATGAAGTCTCCTCTCTTCGAAAA 

TTCTTGTCTGATAAGGTGGATATATTTTGGCATGAACTAAGATGTTTTGCGGAGAGTATA 

CTCACGATGGAGACTTATGATGCAGTGGTTGAATACAATGAGGTGGAGTAA 

>G1521 Amino Acid Sequence (domain in AA coordinates: 39-80) 

MPPLPSSTAPSSSRHLRSPESIAKFAGRAIFPALQGKSCPICLENLTERRSAAVITVCKH

GYCLACIRKWSSFKRNCPL^TRFDSWFIVSDFASRKYHKEQLPIIiRDRE 

DRRRIIQRSRDVLENSSSRSRPLPWRRSFGRPGSVPDSVIFQRKLQWRASIYTKQLRAVR

LHSRRLELSLAVNDYTKAKITERIEPWIRREIjQAVLGDPDPSVIVHFASALFIKRLER^ 

NRQTGQTGMLVEDEVSSLRKFLSDKVDIFWHELRCFAESILTKETYDAVVEYNEVE*

>G183 (1..1458) 

ATGAGTGATTTTGATGAAAACTTCATCGAAATGACGTCGTATTGGGCTCCACCATCCAGT 

CCTAGCCCAAGAACGATATTGGCAATGCTGGAGCAAACCGACAATGGTCTGAATCCAATC

AGTGAGATCTTCCCTCAAGAAAGCTTGCCAAGAGATCATACTGATCAATCTGGACAAAGA 

TCTGGTCTTCGTGAGAGACTGGCTGC^AGAGTAGGATTCAATCTTCCJ^CACTCAATACA 

GAAGAAAACA1X3AGTCCTTTGGATGCATTTTTCAGGAGCTCGAATGTTC 

GTCGTTGCAATCTCTCCAGGATTCAGTCCATC^ 

AGTGATTCTTCCCAGATTATCCCTCCGTCTTCAGCCACCAATTACGGACCTCTAGAGATG 
GTGGAAACTTCCGGTGAAGACAATGCAGCGATGATGATGTTCAACAACGATCTTCCTTAT 
CAGCCGTACAATGTTGATCTGCCTTCTCTAGAAGTCTTTGATGATATTGCAACGGAAGAG 
TCCTTTTATATCCCATCTTATGAACCTCATGTTGACCCAATTGGAACTCCTTTAGTCACA 
TCCTTTGAATCTGAACTCGTTGACGATGCCCATACCGACATCATCTCCATTGAGGACAGT 
GAGAGCGAGGATGGAAACAAAGATGATGACGACGAGGACTTCCAATACGAAGACGAAGAC 
GAAGACCAATACGACCAAGATCAAGATGTAGATGAAGATGAAGAGGAAGAAAAAGATGAA 
GACAATGTTGCATTAGATGATCCTCAACCTCCACCTCCAAAGAGAAGGAGATATGAGGTA 
TCAAACATGATTGGAGCCACAAGAACAAGCAAGACACA^ 

AGCGACGAAGACAATCCTAACGATGGTTATCGCTGGAGAAAATACGGTCAGAAAGTCGTC 
AAAGGAAATCCTAATCCGAGGAGTTACTTCAAGTGCACAAACATCGAGTGCAGAGTGAAA 
AAACATGTGGAGAGAGGAGCAGACAATATCAAGTTGGTTGTGACTACATACGATGGGATA 
CAC^CCATCCTTCACCACCTGCACGTAGAAGCAATTCCAGTTCAAGGAACCGGTCTGCA 
GGGGCAACAATACCTCAAAATCAGAATGATCGAACCAGTCGGTTAGGTAGGGCTCCTCCT 
ACTCCTACTCCTCCTACTCCTCCTCCTTCGTCTTACACACCTGAGGAGATGAGGCCTTTC 
TCTTCGTTGGCTACAGAAATTGATCTGACAGAGGTTTATATGACCGGAATCTCTATGCTG 
CCGAATATACCGGTTTACGAGAATTCGGGTTTTATGTACCAGAATGATGAACCGACGATG 
AATGCGATGCCGGATGGTTCAGATGTGTACGATGGGATCATGGAACGCCTGTATTTTAAG 
TTTGGTGTCGACATGTAG 

>G183 Amino Acid Sequence (domain in AA coordinates: TBD) 

MSDFDENFIEMTSYWAPPSSPSPRTILAMLEQTDNGLNPISEIFPQESLPRDHTDQSGQR

SGLRERLAARVGFNLPTLOTEElWSPIiDAF 

SDSSQIIPPSSATNYGPLEMVETSGEDNAAMMMFNNDLPYQPYNVDLPSLEVFDDIATEE
SFYIPSYEPHVDPIGTPLVTSFESELVDDAHTDIISIEDSESEDGNKDDDDEDFQYEDED
EDQYDQDQDVDEDEEEEKDEDNVALDDPQPPPPKRRRYEVSNMIGATRTSKTQRIILQME 
SDEDNPNDGYRWRKYGQKWKGNPN^ 

HNHPSPPARRSNSSSRNRSAGATIPQNQNDRTSRLGRAPPTPTPPTPPPSSYTPEEMRPF 

SSLATEIDLTEVYMTGISMI»PNIPVYENSGF 

FGVDM* 
>G2555 (177..956)

CTGTTTTTGTATCCGTGTAAATTAATCACACGGTAGTTTTTGATGAAAAGACAACAATCG 
GAGAACAATCTGGTCTGCTGCTAAAATTTAATAAATTGTTTTGTCrAATTGTCTCCACCC 
ATAAAAAAGCGCGAATTCAATTCACCGACTAAAGACATTCTCCGGTGGAGACCCCGATGC 
AATCCACTCATATAAGCGGCGGAAGTAGCGGTGGTGGTGGTGGAGGAGGAGGAGAGGTGA 
GTCGAAGTGGATTATCTCGGATCCGTTCAGCTCCAGCTACTTGGATTGAAACCCTACTCG 
AAGAAGATGAAGAAGAAGGTTTAAAACCTAACCTTTGTTTAACAGAGCTGCTTACTGGTA 
ATAATAACTCTGGAGGAGTGATAACGAGTCGTGACGACTCGTTCGAGTTCCTGAGTTCTG 
TTGAGCAAGGATTGTATAATC^TCATCAAGGTGGTGGCTTTCACCGTCAGAATAGTTCTC 
CGGCTGATTTTCTTAGTGGGTCTGGTTCTGGGACTGATGGGTATTTCTCTAATTTTGGTA 
TTCCGGCX5AATTATGACTATTTGTCGACCAACGTTGATATTTCTCCGACTAAACGGTCTA 
GAGATATGGAAACACAGTTTTCTTCTCAGCTGAAAGAAGAGCAAATGAGTGGTGGGATAT 
CAGGAATGATGGATATGAAC^TGGACAAGATTTTTGAGGATTCAGOT 

GTGCTAAACX5TGGTTGTGCTACTCATCCTCGTAGCATTGCTGAACGGGTGAGAAGAACGC 
GAATAAGTGATCGGATTAGGAGGCTGCAAGAGCTTGTTCCTAACATGGATAAGCAAACCA 
ACACTGCAGACATGTTGGAAGAAGCTGTGGAGTATGTGAAGGCTCTTCAAAGCCAGATCC 
AGGAATTGACAGAGCAGCAGAAGAGATGCAAATGCAAACCn?A2^ 

TCCTTTAGGATTTGATATATCTGTATTTTATTTTTGTACTATOTAAAAATGGTGATGATC 
TGTTCGAAAATTCGAAACATGATCTTATATATTGAACTAGAAAAAATAGATATATATGAA 
TTTTAGCTGTAAAATTTTTGTACAATAAGGAGAAAAAGATTTAGAAGAGTCAATAAAAAG 
ATGATGTTTACAAGTCAAAAAAAAAAA 

>G2555 Amino Acid Sequence (domain in AA coordinates: 175-245) 
MQSTHISGGSSGGGGGGGGEVSRSGLSRIRSAPATWIETLLEEDEEEGLKPNLCLTELLT
GNNNSGGVITSRDDSFEFLSSVEQGLYNHHQGGGFHRQNSSPADFLSGSGSGTDGYFSNF
GIPAira)YLSTNVDISPTKRSRDMETQFSSQLKEEQMSGGISGM^ 

VRAKRGCATHPRSIAERVRRTRISDRIRRLQELVPNMDKQTNTADMLEEAVEYVKALQSQ

IQELTEQQKRCKCKPKEEQ* 

>G375 (53..1171)

TCGACAAAAACTCTCACTCTCCCTCAAACTAAACAAACATACAGAACACAAAATGGGTC^ 
CACTTCTCTTCAAGTTTGCATGGATTCTGATTGGCTCCAGGAATCCGAGTCATCAGGAGG 
AAGCATGTTAGACTCTTCAACGAATTCTCCGTCAGCAGCCGACATACTAGCAGCTTGCAG 
(^CTAGACCA<^^GCCTCGGCCGTGGCTGTAGCCGCTGCAGCTCTGATGGACGGTGGAAG 
GAGGCTGCGTCCACCTCACGACCATCCTCAAAAGTGTCCTCGTTGCGAGTCAACACATAC 
TAAGTTCTGTTACTACAATAACTACAGCCTCTCTCAGCCTCGTTACTTCTGCAAGACTTG 
TCGCCGTTACTGGACAAT^AGGCGGAACTCTAAGGAATATTCCGGTTGGTGGTGGATGCCG 
TAAAAACAAGAAACCATCTTCCTCTAATTCCTCCTCCTCC^CTTCn^CCGGCAAAAAACC 
ATCCAACATCGTTACCGCCAATACCTCTGATCTTATGGCTTTAGCA(^TTCTC^TCAAAA 
TTACCAACATTCTCCTCTAGGGTTTTCACATTTTGGTGGGATGATGGGGTCTTACTCAAC 
TCCGGAGCATGGTAACGTTGGTTTCTTGGAGAGCAAGTATGGCGGTTTGCTTTCGCAGAG 
CCCTAGACCTATTGATTTCTTGGACAGTAAGTTTGATCTCATGGGAGTGAACAATGACAA 
CCTGGTCATGGTTAATCATGGAAGTAACGGAGATCATCATCATCATCATAATCATCACAT 
GGGTCTGAATCACGGTGTAGGTCTTAACAACAACAACAACAATGGTGGATTTAATGGGAT 
TTCTACGGGAGGCAATGGAAATGGTGGTGGTCTCATGGATATATCGACATGCCAAAGACT 
TATGCTATCTAATTATGATCATCACCATTACAATCATCAAGAAGATCATCAAAGGGTAGC 
AACAATAATGGATGTGAAGCCAAATCCGAAGTTGTTATCGCTTGATTGGCAGCAAGATCA 
ATGCTACTCCAATGGTGGTGGTAGCGGAGGCGCAGGAAAATCCGACGGTGGTGGATACGG 
CAATGGTGGTTATATCAACGGTTTAGGTTCGTCGTGGAATGGTTTGATGAATGGCTATGG 
AACGTCCACTAAAACAAACTCCTTGGTTTGATAAGTTAATCAGAACTTCTTTTTTCTTGT
CGTCATCAACTAGTAGTAGTAGTAATAGTAGTTGGAGACTAGAGAAGCACTTCAAATTAT 
TTATGGGTTTGTTTGCTAAGCCAGTTTTAC 

>G375 Amino Acid Sequence (domain in AA coordinates: 75-103) 
MGLTSLQVCMDSDWLQESESSGGSMLDSSTNSPSAADILAAC
GGRRLRPPHDHPQKCPRCESTHTKFCYYNNYSLSQPRYFCKTCRRYWTKGGTLRNIPVGG
GCRKNKKPSSSNSSSSTSSGKKPSNIVTANTSDLMALAHSHQNYQHSPLGFSHFGGMMGS
YSTPEHGNVGFLESKYGGLLSQSPRPIDFLDSKFDLMGVNNDNLVMVNHGSNGDHHHHHN
HHMGLNHGVGLNNNNNNGGFNGISTGGNGNGGGLMDISTCQRLMLSNYDHHHYNHQEDHQ
RVATIMDVKPNPKLLSLDWQQDQCYSNGGGSGGAGKSDGGGYGNGGYINGLGSSWNGLMN
GYGTSTKTNSLV*
>G1007 (86..763)

ATTCCTTCTTGCCTAGGAACTAATTGTTGCACACTTC 

CGACATCAAAACGAGAGAGAAAAGAATGGTGGATTCTCATGGCTCCGACACGGAATGTTC 
CTCCAAGAAGAAAAAGGAGAAAACGAAAGAAAAGGGGGTATATCGTGGGGCTCGCATGAG 
GAGCTGGGGGAAATGGGTCTCGGAGATTCGGGAGCCCCGTAAGAAATCAAGAATCTGGCT 
CGGGACTTTCCCCACGGCGGAGATGGCAGCGCGTGCCCATGATGTTGCGGCATTGAGTAT 
CAAAGGAAGTTCCGCAATCCTTAACTTCCCTC 

CTCGCTCAGCG2^CAGGATATCCAGGCCGCAGCCGCCGAAGCCGCTCTTATGGATTTCAA 

AACTGTACCATTCCATCTTCAGGATGACTCAACGCCGTTGCAAACTAGGTGTGATACTGA 

GAAGATCGAT^AAGTGGTCATCCTCATCGTCCTGAGCCTCATCCTCATCCTCATCTTCGTC 

CTCGTCCTCATCATCTATGCTTTCGGGGGAGCTAGGAGATATTGTGGAGTTGCCGAGTCT 

TGAAAACAATGTAAAATACGATTGTGCGCTGTATGACTCGTTGGAGGGGCTGGTGTCGAT 

GCCCCCATGGTTAGATGCTACCGAAAATGATTTTAGGTATGGAGATGATTCGGTACTGTT 

GGACCC^TGTCTC^AAGAAAGCTTTTTGTGGAATTATGAGTAAGGTTTTTTTTTGGAAAG 

AAATGTGGTTTTTTGTTTCCTCCTCTCTTTTAT^ 

ATATCTTCTAC^TATGTAATACTTTTC^^ 

AAAAAAAAAAAAAAAAAAAAAAA 

>G1007 Amino Acid Sequence (domain in AA coordinates: TBD) 
MVDSHGSDTECSSKKKKEKTKEKGVYRGARMRSWGKWVSEIREPRKKSRIWLGTFPTAEM
AARAHDVAALSIKGSSAILNFPELADFLPRPVSLSQQDIQAAAAEAALMDFKTVPFHLQD
DSTPLQTRCDTEKIEKWSSSSSSASSSSSSSSSSSSSMLSGELGDIVELPSLENNVKYDC
ALYDSLEGLVSMPPWLDATENDFRYGDDSVLLDPCLKESFLWNYE*
>G1010 (344..1276)

ATTCTTCTTCTAAAAAATCTTGACAACTTTTTGTTTTTGTTTTCTTTCTCTGAATTTTTT

AAAAGAGAGAGAGCTATGTAGCTATGAAACAGTAAGAGATATAGATATAGAGAGACAGAG 

AAAGATGATGATCAGTGAAGTTAGGCTAAACCCACTTTCTATTTATGTATAATTAGGTCA

ATCACATCACCAATCTCCTCCTCCAATTCTCCTCCTCTCCTTCCAAATTCTAGGGTTTTG 

CTTGTATCTCACCCCCTTTCTCAATTCCCTAGGGAAACTGTGAATTTCATCAAATTCCAT 

TATTTTTTGGTCACACCCTTAAAGAGATCTGAGAGTTCTAAAGATGATGACAGATTTATC 

TCTCACGAGAGATGAAGATGAAGAAGAAGCAAAGCCCTTAGCAGAAGAAGAAGGAGCGCG 

TGAAGTAGCAGACAGAGAGCACATGTTCGACAAAGTTGTGACTCCAAGTGATGTCGGAAA 

ACTAAACCGACTTGTGATCCCAAAGCAACACGCAGAGAGATTCTTCCCTTTAGATT 

TTCAAACGAGAAAGGTTTGCTTTTAAACTTCGAAGATCTCACTGGCAAATCTTGGAGGTT 

CCGTTACTCTTACTGGAAC^GTAGTCAAAGCTATGTCATGACTAAAGGTTGGAGCAGATT 

CGTTAAAGACAAAAAGCTTGACGCCGGAGATATTGTCTCTTTCCAAAGATGTGTCGGAGA 

TTCAGGAAGAGATAGCCGTTTGTTTATTGATTGGAGGAGAAGACCTAAAGTCCCTGACCA 

TCCTCATTTCGCCGCCGGAGCTATGTTCCCTAGGTTTTACAGCTTTCCTTCGACCAATTA 

CT^GTCTTTATAATCATCAGCAGCAACGTCATCATCACAGTGGTGGTGGTTATAATTATCA 

TCAAATTCCGAGAGAATTTGGTTATGGTTACTTCGTTAGGTCAGTGGATCAGAGGAACAA 

TCCTGCGGCTGCGGTGGCTGATCCGTTGGTGATTGAATCTGTGCCGGTGATGATGCACGG 

GAGAGCTAATCAGGAACTTGTTGGAACGGCCGGGAAGAGACTGAGGCTTTTTGGAGTTGA 

TATGGAATGCGGCGAGAGCGGAATGACCAACAGTACGGAGGAGGAATCATCATCTTCCGG 

TGGAAGTTTGCCACGTGGAGGCGGTGGTGGTGCTTCATCTTCCTCTTTCTTTCAGCTGAG 

ACTTGGAAGCAGCAGTGAAGATGATCACTTCACTAAGAAAGGAAAGTCTTCATTGTCTTT

TGATTTGGATCAATAATAATGATGATGATGAAATTAGTTGGTATTTTAAGAAAAAAAACA 

TAC^TATATAATTCTATATATATGACAACATAATGCATTGATTTCCTT 

>G1010 Amino Acid Sequence (domain in AA coordinates: 33-122) 

MMTDLSLTRDEDEEEAKPLAEEEGAREVADREHMFDKVVTPSDVGKLNRLVIPKQHAERF
FPLDSSSNEKGLLLNFEDLTGKSWRFRYSYW^
QRCVGDSGRDSRLFIDWRRRPKVPDHPHFAAGAMFPRFYSFPSTNYSLYNHQQQRHHHSG
GGYNYHQIPREFGYGYFVRSVDQRNNPAAAVADPLVIESVPVMMHGRANQELVGTAGKRL
RLFGVDMECGESGMTNSTEEESSSSGGSLPRGGGGGASSSSFFQLRLGSSSEDDHFTKKG
KSSLSFDLDQ*
>G1014 (174..1112)

CACAAACCACAGTCTCTCTTTCTCTCTCTATCTATCTTCTCTTTCTCTCTCTATCTCTAT 
CACTGAAACCCAAAGAGATCCACCATTTGTTCTC 
CTTCC^CACTTCCTTTTTACTAGGCT^GTGTTAACCAATTGAGAGAGAAAAATGATGGTTG
ATGAAAATGTGGAAACCAAGGCCTCTACTTTAGTGGCAAGTGTTGATCATGGGTTTGGAT 
CCGGGTCGGGTCATGATCATCATGGGTTATCGGCGTCTGTGCCTCTTCTTGGTGTTAACT 
GGAAGAAGAGAAGGATGCCTAGACAGAGACGATCTTCTTOTTCCTTTAACCTTCTCTCTT 
TCCCTCCTCCTATGCCTCCTATTTCCGACGTGCCAACTCCTCTCCCCGCACGTAAAATTG 
ACCCAAGAAAGCTAAGATTCCTCTTCCAAAAGGAACTCAAGAACAGTGACGTCAGCTCTC
TCCGACGTATGATACTCCCGAAGAAAGCCGCGGAGGCTCACTTGCCGGCACTTGAATGCA 
AGGAAGGGATTCCTATAAGAATGGAAGATTTGGACGGTTTTCACGTTTGGACCTTCAAGT 
ATAGGTACTGGCCAAACAACAATAGCAGAATGTACGTGCTAGAAAACACAGGCGATTTTG
TGAATGCTCATGGTCTGCAGCTAGGTGACTTCATCATGGTTTACCAAGATCTCTACTCAA
ACAATTACGTTATACAAGCAAGAAAAGCATCGGAAGAAGAAGAAGTAGACGTAATCAATC
TTGAAGAAGACGACGTTTACACAAACTTAACAAGGATCGAAAACACTGTGGTTAACGATC 
TTCTCCTCCAAGATTTTAATCATCACAACAAGAA.CAACAACAACAA 

GCAACAAATGTTCTTACTATTATCCAGTCATAGATGATGTCACCACAAACACAGAGTCTT 

TTGTCTACGACACGACGGCTCTTACCTCCAACGATACTCCTCTCGATTTTTTGGGTGGAC 

ATACGACGACTACTAATAATTATTACTCCAAGTTCGGAACATTCGATGGTTTGGGCTCCG 

TTGAGAATATCTCTCTCGATGACTTCTACTAGATAATCAATCGATGGGCTGATGGTATTC

TTGATGGTGATCAGCTATTTAATATCCTTATAATATATATAAGAATTAAATGCAATTTGC 

ATATATATTATCAAGTGTTGTAATATAACATTACAGTTTAAAAAAAAAAAAAAAAAA 

>G1014 Amino Acid Sequence (domain in AA coordinates: 90-172) 

MVDENVETKASTLVASVDHGFGSGSGHDHHGL^
LSFPPPMPPISHVPTPLPARKIDPRKLRFLFQKELKNSDVSSLRRMILPKKAAEAHLPAL
ECKEGIPIRMEDLDGFHVWTFKYRYWPNNNSRMYVLENTGDFVNAHGLQLGDFIMVYQDL
YSNNYVIQARKASEEEEVDVINLEEDDVYTNLTRIENTVVNDLLLQDFNHHNK^
SNSNKCSYYYPVIDDVTTNTESFVYDTTALTSNDTPLDFLGGHTTTTNNYYSKFGTFDGL
GSVENISLDDFY*

>G1035 (103..624)

CCATAATAATATATTAAAACTATATACTATAATCTTTTTACATAATAAACTTTGGGTCCT 
GCGTCTTAATCATAGTACTTAATTTTCTCTGTGTGTTTTAATATGAATAATAAAACTGAA 
ATGGGATCTTCCACAAGTGGAAATTGCTCGTCGGTTTCAACCACTGGTTTAGCTAACTCC 
GGTTCAGAATCTGATCTCCGGCAACGTGATCTAATCGACGAGCGGAAGAGAAAGAGGAAA 
CAGTCGAACAGAGAATCTGCGAGGAGGTCGAGGATGAGGAAGCAGAAGCATTTGGATGAT 
CTCACTGCTCAGGTGACTCATCTACGTAAAGAAAACGCTCAGATCGTCGCCGGAATCGCC
GTCACGACGCAGCACTACGTCACTATCGAGGCGGAGAACGACATTCTCAGAGCTCAGGTT 
CTTGAACTTAACCACCGTCTCCAATCTCTTAACGAGATCGTTGATTTCGTCGAATCTTCT 
TCTTCAGGATTCGGTATGGAGACCGGTCAGGGATTATTCGACGGTGGATTATTCGACGGC 
GTGATGAATCCTATGAATCTAGGGTTTTATAATCAACCAATCATGGCTTCTGCTTCTACT 
GCTGGTGATGTTTTCAACTGTTAGAAAACTTCAC^^ 

TCATCGCAGCAGGGGTAAAACTGTAATTTTTCTTATAAATTATGTGATGATGCTTTGTTT 
CTTTATTTTATAAGATGGTTAATTAGTGTTTAAAACTGATTGTAATGATAGACAGTGTAA 
GAAATGTGTGATATGATGGAGATGGTGATGTGAGTTTGGTACAAATATTTTAAGATCTTT 
TCTTTCTATATATTAAAAGTGAAGAT^TAATATTTTGTCATTTTCTTAAAAAAAAAAAAA 
AAA 

>G1035 Amino Acid Sequence (domain in AA coordinates: 39-91) 
MNNKTEMGSSTSGNCSSVSTTGLANSGSESDLRQRDLIDERKRKRKQSNRESARRSRMRK
QKHLDDLTAQVTHLRKENAQIVAGIAVTTQHYVTIEAENDILRAQVLELNHRLQSLNEIV
DFVESSSSGFGMETGQGLFDGGLFDGVMNPMNLGFYNQPIMASASTAGDVFNC*
>G1046 (1..567)

ATGATTAGACATCTAAAACCCTACATGGAGTCGTCTAGTGTCCATCGCTCTCATTGTTTC 
GATATTCTTGATGGAGTCCCACTACACGACGATCATTTCAACTCGGCATTCCTACCAAAC 
ACTGACTTTAATGTTCATTTGCAGTCAAACGTATCGACCCG 

TTAGACCCAAATGCAGAAAACATTTTCCATAACGAAGGTCTTGCTCCAGAAGAAAGAAGA 

GCAAGAAGAATGGTCTCTAACCGGGAATCTGCAAGGAGGTCACGTATGCGCAAAAAGAAG 

CAGATCGAAGAGCTGCAACAACAAGTTGAACAACTCATGATGTTGAATCATCACTTGTCT 

GAGAAAGTCATCAACTTGTTGGAAAGCAACCATCAGATCCTACAAGAGAACTCACA 

AAAGAGAT^GTCTCTTCCTTTCACTTGCTCATGGCAGATGTGCTATTACCCATGAGAAAT 

GCAGAGAGCAACATCAATGACCGCAATGTGAATTATCTAAGAGGAGAACCATCAAACCGT 
CCCACCAACAGTCCCTTTGGTAAGTAA

>G1046 Amino Acid Sequence (conserved domain in AA coordinates: 79-138)
MIRHLKPYMESSSVHRSHCFDILDGVPLHDDH
LDPNAENIFHNEGLAPEERRARRMVSNRESARRSRMRKKKQIEELQQQVEQLMMLNHHLS
EKVINLLESNHQILQENSQLKEKVSSFHLLMADVLLPMRNAESNINDRNVNYLRGEPSNR
PTNSPFGK*

>G1049 (29..550)

CTAACTTTCCTCCCAAGTAAACTTCAAAATC 

TAACTACCTAAACTCATCGATACTGC^GTCTCCGTATCCTTCTAATTTCCCGATATCTAC 
GCCATTTCCAACCAACGGTCAAZyVCCCGTACCT 

CAATCCACAATCCATGAGCCTAAGCAGCAACAACTCAACATCAGATGAAGCAGAAGAGCA 

GCAGACGAACAACAATATAATCAACGAGCGGAAGCAGAGAAGGATGATTTCAAACCGAGA 

ATCCGC^\AGGAGATCGCGTATGAGGAAGCAAAGACACCTTGACGAGCTTTGGTCACAAGT 

GATGTGGTTAAGGATCGAGAATCATCAGTTGCTTGATAAGCTTAACAATCTCTCTGAGTC 

TCACGACAAGGTTCTTCAAGAGAATGCTCAGCTTAAAGAAGAAACATTTGAGCTTAAGCA 

AGTGATCAGCGATATGCAAATTCAAAGCCCTTTCTCTTGCTTTA 

CATTGAATAAAGCATTTTTCCCCGATTC^^ 

TCTTTGTATGTATATGTGGAGATGTATTTCAGGGTTTTGATAATATGACCCTTTACGACG 
ACGTTTTTAGATTGTAGTAAATTTATAAACTAAAGAAGATTAGTGTTAATG 

TATAA 

>G1049 Amino Acid Sequence (domain in AA coordinates: 77-132)
MQPQTDVFSLHNYLNSSILQSPYPSNFPISTPFPTNGQNPYLLYGFQSPTNNPQSMSLSS 
NNSTSDEAEEQQTNNNIINERKQRRMISNRESARRSRMRKQRHLDELWSQVMWLRIENHQ
LLDKLNNLSESHDKVLQENAQLKEETFELKQVISDMQIQSPFSCFRDDIIPIE* 
>G1069 (89..934)

TTGGAACCCTAGAGGCCTTTCAAGCAAATCATCAGGGTAACAATTTCTTO 
TTAGCGAATTTCCAGTTTTTGGTCAATCATGGCAAACCC 

TTTAGCGGGCATGGTGGACCATTCGGTCTCCTCAGGCCATCACCAAAACCATCACCACCA 
AAGTCTTCTTACCAAAGGAGATCTTGGAATAGCCATGAATCAGAGCCAAGACAACGACCA 
AGACGAAGAAGATGATCCTAGAGAAGGAGCCGTTGAGGTGGTCAACCGTAGACCAAGAGG 
TAGACCACCAGGATCCAAAAACAAACCCAAAGCTCCAATCTTTGTGACAAGAGACAGCCC 
CAACGCACTCCGTAGCCATGTCTTGGAGATCTCCGACGGCAGTGACGTCGCCGACACAAT 
CGCTCACTTCTCAAGACGC^GGCAACGCGGCGTTTGCGTTCTCAGCGGGACAGGCTCAGT 
CGCTAACGTCACCCTCCGCCAAGCCGCCGCACCAGGAGGTGTGGTCTCTCTCCAAGGCAG 
GTTTGAAATCTTATCTTTAACCGGTGCTTTCCTCCCTGGACCTTCCCCACCCGGGTCAAC 
CGGTTTAACGGTTTACTTAGCCGGGGTCCAGGGTCAGGTCGTTGGAGGTAGCGTTGTAGG 
CCCACTCTTAGCCATAGGGTCGGTCATGGTGATTGCTGCTACTTTCTCTAACGCTACTTA 
TGAGAGATTGCCCATGGAAGAAGAGGAAGACGGTGGCGGCTCAAGACAGATTCACGGAGG 
CGGTGACTCACCGCCCAGAATCGGTAGTAACCTGCCTGATCTATCAGGGATGGCCGGGCC 
AGGCTACAATATGCCGCCGCATCTGATTCCAAATGGGGCTGGTCAGCTAGGGCACGAACC 
ATATACATGGGTCCACGCAAGACCACCTTACTGACTCAGTGAGCCATTTCTATATATAAT 
GGTCTATATAAATAAATATATAGATGAATATAAGCAAGCAATTTGAGGTAGTCTATTACA 
AAGCTTTTGCTCTGGTTGGAAAAATAAATAAGTATCAAAGCTTTGTTTGTTCTTAATGGA 
AATATAGAGCTTGGGAAGGTAGAAAGAGACGACATT 

>G1069 Amino Acid Sequence (domain in AA coordinates: 67-74) 

MANPWWTNQSGLAGMVDHSVSSGHHQNHHHQSLLTKGDLGIAMNQSQDNDQDEEDDPREG 

AVEVVNRRPRGRPPGSKNKPKAPIFVTRDSPNALRSHVLEISDGSDVADTIAHFSRRRQR
GVCVLSGTGSVANVTLRQAAAPGGVVSLQGRFEILSLTGAFLPGPSPPGSTGLTVYLAGV
QGQVVGGSVVGPLLAIGSVMVIAATFSNATYERLPMEEEEDGGGSRQIHGGGDSPPRIGS
NLPDLSGMAGPGYNMPPHLIPNGAGQLGHEPYTWVHARPPY*

>G1070 (170..1144)

TCGACCAGCTTGGATTTCGTTGTTCATCATTACTACTCTCTTTCTTCTTCTAGCTAGCTA 

GTTTTGACAGCAAAATAAGAAGGAAAAAAAAGGTG^ 

CACTCTCTTCITCTTTTTT^^ 

ACAATCTCATGGATGACAAAGCTCTCTACCTC^ 

ACATCTTC^U^CAACAGCAACAAGAGTTCTTCCT 

AACCGATGGTGACCAACAAGGAGGATCAGGAGGAAACCGACAAATCAAGATGGATCGTGA 
AGAGACAAGCGACAACATAGACAACATAGCTAACAACAGCGGTAGTGAAGGTAAAGACAT

AGATATACACGGTGGTTCAGGAGAAGGAGGTGGTGGCTCCGGAGGAGATCATCAGATGAC 

AAGAAGACCi\AGAGGAAGACCAGCGGGATCGAAGAACAAACCAAAACCACCGATTATCAT 

CACACGGGACAGCGCAAACGCGCTTAGAACCCACGTGATGGAGATCGGAGATGGCTGCGA 

CTTAGTCGAAAGCGTTGCCACTTTTGCACGAAGACGCCAACGCGGCGTTTGCGTTATGAG 

CGGTACTGGAAATGTTACTAACGTCACTATACGTCAGCCTGGATCTCATCCTTCTCCTGG 

CTCGGTAGTTAGTCTTCACGGAAGGTTCGAGATTCTATCTCTCTCAGGATCTTTTCTCCC 

TCCTCCGGCTCCTCCTACAGCCACCGGATTGAGTGTTTACCTCGCTGGAGGACAAGGACA 

GGTGGTTGGAGGAAGCGTAGTTGGTCCGTTGTTATGTGCTGGTCCTGTCGTTGTCATGGC 

TGCGTCTTTTAGCAATGCGGCGTACGAAAGGTTGCCTTTAGAGGAAGATGAGATGCAGAC 

GCCGGTTCATGGCGGAGGAGGAGGAGGATCATTGGAGTCGCCGCCAATGATGGGACAACA 

ACTGCAACATCAGCAAGAAGCTATGTCAGGTCATCAAGGGTTACCACCTAATCTTCTTGG 

TTCGGTTCAGTTGGAGCAGCAACATGATCAGTCTTATTGGTCAACGGGACGACCACCGTA

TTGATCAAATATACACACACACTCATAATCGTTGCTAGCTAGCTAACGATGAATCATGAG 

TTTAGTGGATATATATATGATTAAAAGAGGTTAGCTTATGAACATTAATAAGAGTTTGGA

TTCTATCGAGCTTCATTATGTTTGGGTCATCGTTC 

>G1070 Amino Acid Sequence (domain in AA coordinates: 98-120) 
MDPVQSHGSQSSLPPPFHAKDFQLHLQQQQQEFFLHHHQQQRNQTDGDQQGGSGGNRQIK
MDREETSDNIDNIANNSGSEGKDIDIHGGSGEGGGGSGGDHQMTRRPRGRPAGSKNKPKP
PIIITRDSANALRTHVMEIGDGCDLVESVATFARRRQRGVCVMSGTGNVTNVTIRQPGSH
PSPGSVVSLHGRFEILSLSGSFLPPPAPPTATGLSVYLAGGQGQVVGGSVVGPLLCAGPV
VVMAASFSNAAYERLPLEEDEMQTPVHGGGGGGSLESPPMMGQQLQHQQQAMSGHQGLPP
NLLGSVQLQQQHDQSYWSTGRPPY*
>G1076 (198..1076)

ATTTTAGTCTTCCTATAACTTCTTCTCAATC 
TTTCAATAAAATAGAAAAAAACATATACAAAT^ 

CTTGTGTGTGTGTGTGTGTTTTATATAATTTTTA'riUT'rrri'CAAATTAAAATCTCTTCT 
TTGCTTTT6ATGTGGGCATGGCTGGTCTTGATCTAGGCACAGCT 

ACCAGCTCCATCGTCCCGATCTCC^CCTTC^CCAC^TTCCTCCTCCGATGACGTCACTC 
CCGGAGCCGGGATGGGTCATTTCACCGTCGACGACGAAGAC^^ 

GTCTTGACTTAGCCTCTGGTGGAGGATCAGGAAGCTCTGGAGGAGGAGGAGGTCACGGCG 
GGGGAGGAGATGTCGTTGGTCGTCGTCCACGTGGCAGACCACCGGGATCCAAGAACAAAC
CGAAACCTCCGGTAATTATCACGCGCGAGAGCGCAAACACTC^ 

AAGTAACAAACGGCTGCGATGTTTTCGACTGCGTTGCGACTTATGCTCGTCGGAGACAGC 
GAGGGATCTGCGTTCTGAGCGGTAGCGGAACGGTCACGAACGTCAGCATACGTCAGCCAT 
CTGCGGCTGGAGCGGTTGTGACGCTACAAGGAACGTTCGAGATTCTTTCTCTCTCCGGAT 

CGTTTCTTCCTCCTCCGGCACCTCC^ 

GACAAGGTCAGGTGGTTGGAGGAAGCGTTGTGGGTGAGCTTACGGCGGCTGGACCGGTGA 

TTGTGATTGCAGCTTCGTTTACTAATGTTGCTTATGAGAGACTTCCTTTAGAAGAAGATG 

AGCAGCAGCAACAGCTTGGAGGAGGATCTAACGGCGGAGGTAATTTGTTTCCGGAGGTGG 

CAGCTGGAGGAGGAGGAGGACTTCCGTTCTTTAATTTACCGATGAATATGCAACCAAATG 

TGCAACTTCCGGTGGAAGGTTGGCCGGGGAATTCCGGTGGAAGAGGTCCTTTCTGATGTG 

TATATATTGATAATCATTATATATATACCGGCGGAGAAGCTTTTCCGGCGAAGAATTTGC 

GAGAGTGAAGAAAGGTTAGAAAAGCTTTTAATGGACTAATGAATTTCAAATTATCATCGT 

GATTTCGGACATTGTCTTGTTCATCATGTTAAGCTTAGGTTTATTTTTTGTCGTTTGTAG 

AATTTTATGTTTGAATCCTTTTTTTTTTCTGTGAAACTCTATTGTG 

AAAAAAAAATTCTCAAAAAAAA 

>G1076 Amino Acid Sequence (domain in AA coordinates: 82-89)
MAGIiDLGTAFRYViraQLHRPDLHIJn^ 

GGGSGSSGGGGGHGGGGDVVGRRPRGRPPGSKNKPKPPVIITRESANTLRAHILEVTNGC
DVFDCVATYARRRQRGICVLSGSGTVTNVSIRQPSAAGAVVTLQGTFEILSLSGSFLPPP
APPGATSLTIFIAGGQGQVVGGSVVGELTAAGPVIVIAASFTNVAYERLPLEEDEQQQQL
GGGSNGGGNLFPEVAAGGGGGLPFFNLPMNMQPNVQLPVEGWPGNSGGRGPF*
>G1089 (31..2427)

AAGTAAGAGAGCTTCTTAAGGAAGAAGAAGATGGGTTGTGCTCAATCAAAGATCGAGAAC 
GAAGAAGCAGTTACTCGTTGCAAAGAACGAAAACAATTGATGAAAGACGCCGTCACTGCT 
CGTAACGCTTTCGCCGCCGCTCACTCAGCTTACGCTATGGCTCTTAAAAACACCGGAGCT 
GCTCTTTCCGATTACTCTCACGGCGAGTTTTTAGTCTCTAATCACTCGTCTTCCTCCGCA

GCTGCAGCAATCGCTTCTACTTCTTCTCTTCCCACTGCTATATCTCCTCCTCTTCCTTCT

TCCACCGCTCCGGTTTCTAATTCAACCGCTTCTTCTTCCTCCGCTGCGGTTCCTCAGCCG 

ATTCCTGATACTCTTCCTCCTCCTCCTCCTCCACCACCGCTTCCTCTTCAACGTGCTGCT 

ACTATGCCGGAGATGAACGGTAGATCCGGTGGTGGTCATGCTGGTAGTGGACTCAACGGA 

ATTGAAGAAGATGGAGCCCTAGATAACGATGATGATGACGATGATGATGATGATGACTCT 

GAAATGGAGAATCGTGATCGTTTGATTAGGAAATCGAGAAGCCGTGGAGGTAGTACTAGA 

GGAAATAGGACGACGATTGAAGATCATCATCTTCAGGAGGAGAAAGCTCCGCCACCTCCC 

CCTTTGGCGAATTCGCGGCCAATTCCGCCGCCACGTCAGCATCAGCATCAACAT 

CAGCAACAAC^CCTTTCTACGATTACTTCTTCCCTAATGTTGAGAATATGCCTGGAACT 

ACTTTAGAAGATACTCCTCCACAACCACAACCACAACCAACAAGGCCTGTGCCTCCTCAA 

CCACATTCACCAGTCGTTACTGAGGATGACGAAGATGAGGAGGAGGAAGAGGAGGAAGAG 

GAGGAGGAAGAGGAGACGGTGATTGAACGGAAACCACTGGTGGAGGAAAGACCGAAGAGA 

GTAGAGGAAGTGACGATTGAATTGGAAAAAGTTACTAATTTGAGAGGGATGAAGAAGAGT 

AAAGGGATAGGGATTCCCGGAGAGAGGAGAGGAATGCGAATGCCGGTGACTGCGACGCAT 

TTGGCGAATGTATTCATTGAGCTTGATGATAATTTCTTGAAAGCTTCTGAAAGTGCTCAT 

GATGTTTCTAAGATGCTTGAAGCTACTAGGCTCGATTACCATTCTAATTTTGCAGATAAC 

CGAGGACATATTGATCACTCTGCTAGAGTGATGCGTGTAATTACATGGAATAGATCATTT 

AGAGGAATACCAAATGCTGATGATGGGAAAGATGATGTTGATTTGGAAGAGAATGAAACT 

GATGCTACTGTTCTTGACAAATTGCTAGCATGGGAAAAGAAGCTCTATGACGAAGTCAAG 

GCTGGCGAACTGATGAAAATCGAGTACCAGAAAAAGGTTGCTGATTTAAATCGGGTGAAG 

AAACGAGGTGGCCACTCGGATT(^TTAGAGAGAGCTAAAGCAGCAGTAAGTC^TTTGCAT 

ACAAGATATATAGTTGATATGCAATCCATGGACTCCACAGTTTGAGAAATCAATCGTCTT 

AGGGATGAACAACTATACCTAAAGCTCGTTCACCTTGTTGAGGCGATGGGGAAGATGTGG 

GAAATGATGCAAATACAT(^TCAAAGACAAGCTGAGATCTQ\AAGGTGTTGAGATCTCTA 

GATGTTTCACAAGCGGTGAAAGAAACAAATGATCATCATGAC^ 

TTGGCAGTGGTTCAAGAATGGC^CACGCAGTITTGCAGGATGATAGATCATCAGAAAGAA 

TACATAAAAGCACTTGGCGGATGGCTAAAGCTAAATCTCATCCCTATCGAAAGCACACTC

AAGGAGAAAGTATCTTCGCCTCCTCGAGTTCCCAATCCCGCAATCCAAAAACTCCTCCAC 

GCTTGGTATGACCGTTTAGAGAAAATCCCCGACGAAATGGCTAAAAGTGCCATAATCAAT 

TTCGCAGCGGTTGTAAGCACGATAATGCAGCAGCAAGAAGACGAGATAAGTCTCAGAAAC 

AAATGCGAAGAGAGAAGAAAAGAATTGGGAAGAAAAATTAGAGAGTTTGAGGA 

CACAAATACATCCAGAAGAGAGGACCGGAGGGGATGAATCCGGATGAAGCGGATAACGAT 

CATAATGATGAGGTCGCTGTGAGGCAATTCAATGTAGAACAAATTAAGAAGAGGTTGGAA 

GAAGAAGAAGAAGCTTACGATAGAC^AAGCCATCAAGTTAGAGAGAAGTCACTGGCTAGT 

CTTCGAACTCGCCTCCCCGAGCTTTTTCAGGCAATGTCCGAGGTTGCGTATTCATGTTCG 

GATATGTATAGAGCTATAACGTATGCGAGTAAGCGGCAAAGCCAAAGCGAACGGGATCAG 

AAACCTAGCCAGGGAGAGAGTTCGTAAGAACTAATGTAAGATCAGAGTAATGTCTTCTTC 

TTCTTTGATCTTGAATATTTAAGCACAGACATAC^ 

ATTGCTTTCTTATATTAAGGTTTTGGCTTTTGTAAGAAGGTTTCTTACATATGAGATTCA 
TATAGTGTTTGATTCTTAAGGAACTGTTCTGTTGAGTAATAAGAAAGTTGTGTATTGAAA 
TAGAGTTGCATTTGTTAATTTTG 

>G1089 Amino Acid Sequence (domain in AA coordinates: 425-500)
MGCAQSKIENEEAVTRCKERKQLMKDAVTARNAFAAAHSAYAMALKNTGAALSDYSHGEF
LVSNHSSSSAAAAIASTSSLPTAISPPLPSSTAPVSNSTASSSSAAVPQPIPDTLPPPPP
PPPLPLQRAATMPEMNGRSGGGHAGSGLNGIEEDGALDNDDDDDDDDDDSEMENRDRLIR
KSRSRGGSTRGNRTTIEDHHLQEEKAPPPPPLANSRPIPPPRQHQHQHQQQQQQPFYDYF
FPNVENMPGTTLEDTPPQPQPQPTRPVPPQPHSPVVTEDDEDEEEEEEEEEEEEETVIER
KPLVEERPKRVEEVTIELEKVTNLRGMKKSKGIGIPGERRGMRMPVTATHLANVFIELDD
NFLKASESAHDVSKMLEATRLHYHSNFADNRGHIDHSARVMRVITWNRSFRGIPNADDGK

DDVDLEENETHATVTiDKLLA^ 

RAKAAVSHLHTRYIVDMQSMDSTVSEIW^ 

AEISKVLRSLDVSQAVKETNDHHHERTIQLLAVVQEWHTQFCRMIDHQKEYIKALGGWLK
LNLIPIESTLKEKVSSPPRVPNPAIQKLLHAWYDRLDKIPDEMAKSAIINFAAVVSTIMQ
QQEDEISLRNKCEETRKELGRKIRQFEDWYHKYIQKRGPEGMMPDEADNDHNDEVAVRQF
NVEQIKKRLEEEEEAYHRQSHQVREKSLASLRTRLPELFQAMSEVAYSCSDMYRAITYAS
KRQSQSERHQKPSQGQSS*
>G1093 (1..531)

ATGGGTTATCCGGTGGGGTACACTGAGCTCCTCCTCCCAAGAATCTTCCTTCACTTACTC 

TCTCTCTTAGGCTTAATACGAAC^CTCATAGAC^CGGGTTTTCGGATATTGGGTCTACCC 

GACTTTCTCGAATCCGACCCGGTTTCATCGTCATCGTCATGGCTGGAACCACCGTATATG 

TCCACGGCGGCGCATCATCACCAAGAAAGCTCATTTTTCTTCCCAGTGGCGGCGAGGCTA 

GCTGGAGAAATCTTGCCCGTCATCAGATTCTCGGAGCTAACTCGACCCGGATTCGGATCC 

GGATCCGATTGCTGCGCGGTGTGCCTCCACGAGTTCGAGAACGATGACGAGATCCGACGG 

CTGACGAATTGTCAACACATATTTCACCGGAGCTGTTTAGACCGTTGGATGATGGGTTAT 

AATCAGATGACGTGTCCACTTTGTAGAACGCCGTTTATTTCTGATGAGTTACAAGTTGCT 

TTTAACCAACGAGTTTGGTCTGAATCTGAACTTCTCGCAGAATCAAATTAG 

>G1093 Amino Acid Sequence (domain in AA coordinates: 105-148) 

MGYPVGYTELLLPRIFLHLLSLLGLIRTLIDTGFRILGLPDFLESDPVSSSSSWLEPPYM
STAAHHHQESSFFFPVAARLAGEILPVIRFSELTRPGFGSGSDCCAVCLHEFENDDEIRR
LTNCQHIFHRSCLDRWMMGYNQMTCPLCRTPFISDELQVAFNQRVWSESELLAESN*

>G1127 (191..1351)

GACAGACTCTCTCTGTATGTGTGCGAGAAGCGAGAAGCGAGAGAGAGAGAGAGAGAGTTG 
TTAGCTCACACGCTTTCTCTATTTT 

CGAGAATTAAGCCGAAAGAAACAATCTTTGAGTTTGATTTCTTCTTCCTTCCTTCTCTCT 
CTCTGCTCTAATGGATTCCAGAGACATCCCACCGTCACATAACCAGCTTCAACCACCACC 
GGGAATGTTAATGTCTCATTACCGTAACCCTAACGCCGCCGCTTCACCATTAATGGTTCC 
C^CTTCCACATCTCAACCGATTCAACACCCT 

TCAAACGTTTCATCAGCAGCAACAACAACAAATGGATCAGAAGACTCTTGAATCTCTTGG 
ATTTGGTGATGGATCACCTTCTTCTCAACCGATGCGATTCGGGATCGATGATCAGAATCA
GCAACTGCAAGTGAAGAAGAAGCGAGGAAGGCCGAGAAAGTATACTCCTGATGGTAGCAT 
TGCTTTAGGTTTAGCTCCTACGTCTCCTCTTCTCTCTGCAGCTTCTAATTCTTACGGTGA 
GGGTGGTGTTGGAGATAGTGGTGGAAATGGAAACTCTGTTGATCCACCTGTTAAACGTAA 
CAGAGGAAGGCCTCCTGGTTCTAGTAAGAj?^CAGCTTGATGCTTTAGGAGGAACTTCAGG 
AGTTGGGTTTAC^CCTC^TGTCATTGAAGTGAACA(^GGAGAGGACATAGCGTCAAAGGT 
GATGGCTTTTTCGGATCAAGGGTCAAGAACAATTTGTATTCTCTCTGCAAGTGGTGCAGT 
TTCTAGAGTGATGCTTCGTCAAGCTTCTCATTCTAGTGGAATCGTTACTTATGAGGGACG 
ATTTGAGATCATTACTCTCTCAGGCTCAGTCTTGAATTATGAGGTAAATGGTTCCACCAA 
CAGAAGTGGTAACTTGAGTGTGGCTTTGGCTGGACCTGATGGCGGCATCGTAGGTGGCAG 
TGTAGTTGGTAATCTAGTAGCTGCAACACAAGTCCAGGTGATAGTGGGAAGCTTTGTTGC 
AGAAGCAAAGAAACCGAAACAAAGTAGTGTTAACATTGCTCGGGGGCAGAATCCTGAACC 
GGCTTC^GCGCCGGCTAACATGTTGAACTTTGGATCAGTCTCTCAAGGACCATCGAGCGA 
GTCATC^GAAGAGAATGAGAGCGGTTCTCCTGCAATGCIACCGTGACAATAATAATGGGAT 
ATATGGAGCTCAACAACAACAACAACAACAACC 

CCAACATCTTTGGTCTAATCATGGTGAATAAAATGAAGCGGAAATTAATTTGTTTCCGTT 
TTGGTTACGGTTATGGTTTGATTTCTT 

>G1127 Amino Acid Sequence (domain in AA coordinates: 103-110, 155-162)
MDSRDIPPSHNQLQPPPGMLMSHYRNPNAAASPLMVPTSTSQPIQHPRLPFGNQQQSQTF
HQQQQQQMDQKTLESLGFGDGSPSSQPMRFGIDDQNQQLQVKKKRGRPRKYTPDGSIALG
LAPTSPLLSAASNSYGEGGVGDSGGNGNSVDPPVKRNRGRPPGSSKKQLDALGGTSGVGF
TPHVTEVNTGEDIASKVMAFSDQGSRTICILSASGAVSRVMLRQASHSSGIVTYEGRFEI
ITLSGSVLNYEVNGSTNRSGNLSVALAGPDGGIVGGSVVGNLVAATQVQVIVGSFVAEAK
KPKQSSVNIARGQNPEPASAPANMLNFGSVSQGPSSESSEENESGSPAMHRDNNNGIYGA
QQQQQQQPLHPHQMQMYQHLWSNHGQ*
>G1131 (57..756)

TCX^CTCCTCTCCTGATTGCTTCACCTTCTTCTTTACTACAGG 

CCATGGATTGCTTAAGCTACTTCTTTAACTACGATCCTCCTGTCCAGCTCCAGGATTGCT 
TTATTCCCGAGATGGATATGATTATCCCTGAAACCGATAGTTTCTTCTTCCAATCTCAAC 
CGCAACTGGAGTTTCATCAGCCATTGT 

ACCCTTTCTGCGACCAGTTTCTTTCTCCGCAAGAAATCTTTCTCCCTAACCCTAAAAACG 
AAATCTTCAACGAAACACACGACCTCGATTTCTTTCTCCCCACGCCAAAACGCCAGAGAC 
TTGTTAACTCCAGCTACAATTGTAACACTCAAAACCATTTCCAGAGCCGTAACCCGAATT 
TCTTCGACCCTTTCGGCGACACTGATTTCGTTCCAGAATCTTGTACCTTCCAGGAGTTTC 
GAGTTCCGGATTTCTCTTTAGCTTTCAAGGTAGGCCGGGGAGATCAAGATGACTCAAAGA 
AACCGACGCTTTCATCTCAGAGCATCGCGGCTAGAGGGAGGAGAAGAAGAATTGCAiSAGA

AGACTCACGAGCTCGGAAAACTCATCCCCGGTGGCAATAAACTTAACACCGCCGAGATGT 

TCCAAGCCGCCGCTAAGTATGTCAAGTTTTTGCAGAGTCAAGTTGGGATTCTCCAAC 

TGCAGACCACAAAGAAGGTAATAACCAACCCCAAATAAGAACTTTATCATCCAATTGAAA

CTCTAATCGTGTTTTCTCACAAGCTTCTTAATTTGTTTACGCAGGGTAGCTCTAATGTGC 

AAATGGAAACTCAGTATTTGCTTGAATCGCAAGCAATCCAGGAGAAGTTATCAACAGAGG 

AAGTGTGTTTGGTACCGTGTGAAATGGTTCAAGATCTAACAACTGAAGAAACCATTTGCA 

GAACCCCGAATATTTCTCGAGAAATCAACAAGTTACTGTCTAAACATCTGGCTAACTAGT 

TTTAGTTTCAAGCCTGAAGTTCTCTATGCCTAAATTTGTGTCTGTTATCGTTGTTTTGTC 

TTCTTAGTTAGTGTTTTGTCTTGTTGATTTAGGGGCTAATTATCCTGGTTAATCTCCTCT 

TAACTGGGAA 

>G1131 Amino Acid Sequence (domain in AA coordinates: 173-220) 

MSMDCLSYFFNYDPPVQLQDCFIPEMDMIIPETDSFFFQSQPQLEFHQPLFQEEAPSQTH 

FDPFCDQFLSPQEIFLPNPKNEIFNETHDLDFFLPTPKRQRLVNSSYNCNTQNHFQSRNP
NFFDPFGDTDFVPESCTFQEFRVPDFSLAFKVGRGDQDDSKKPTLSSQSIAARGRRRRIA
EKTHELGKLIPGGNKLNTAEMFQAAAKYVKFLQSQVGILQLMQTTKKVTTNPK*
>G1145 (243..1142)

GTGATTTCTCTCTGCCATTTCCTTCGATTTGATTTCTGGGTTCTCTTCTTCTCGTCTCTC 

TTCTGCATGTTTCGCGACTCTACCTTAGAT^AAAAGGTTACTTTCGCCTCCGATTTAGGCT 

CGATTTGATGAATTCGTCGTCGTGTGGCTATTTATCAAATTGAGCATTAGGGTTTCTGAT 

TTGTGGGTTCAGAATTGTTTTTATCTATCTGTCTTGTTGTTTTTTGTCCGCTAGAAAA 

CTATGGATTCTCAGAGGGGTATTGTTGAACAAGCTAAATCTCAGTCCTTGAATAGGCAAA 

GCTCTCTTTACAGCTTAACACTTGATGAGGTTCAAAATCACTTGGGGAGTTCTGGTAAAG 

CTCTGGGAAGCATGAACCTTGATGAGCTTTTGAAGAGTGTCTGTTCTGTTGAAGCTAATC 

AGCCATCGTCTATGGCTGTCAATGGTGGAGCAGCTGCTCAGGAGGGTCTTTCTCGCCAGG 

GGAGTTTGACTTTGCCTCGGGATCTCAGCAAAAAGACTGTTGATGAGGTTTGGAAAGACA 

TTCAGCAGAATAAGAATGGAGGTAGTGCTC^TGAGAGGAGGGATAAGCAGCCTACACTTG 

GGGAAATGACGCTTGAAGACCTGTTGTTGAAAGCAGGAGTGGTCACTGAGACTATCCCTG 

GTTCGAACCATGATGGTCCTGTTGGTGGTGGTAGTGCTGGTTCAGGTGCTGGTTTAGGGC 

AAAACATTACTCAAGTTGGCCCATGGATTCAATAT^ 

CTCAAGCATTTATGCCCTATCCGGTTTCAGATATGCAAGCAATGGTGTCTCAGTCTTCTT 

TGATGGGTGGTTTGTCAGATACACAAACTCCTGGAAGGAAGAGGGTAGCTTCAGGAGAAG 

TTGTAGAGAAGACTGTAGAGAGGAGGCAGAAGAGAATGATAAAGAACAGAGAGTCTGCTG 

CTCGTTCCCGAGCTAGGAAACAGGCTTACACTCATGAGCTAGAGATCAAAGTTTCACGGT 

TAGAAGAAGAAAACGAAAGACTCAGGAAGCAAAAGGAGGTGGAAAAATCCTCCCAAGTGT 

ACCACCGCCTGATCCCAAGCGGCAGCTCCGACGGACAAGCTCGGCTCCTTTCTGATCTCT 

AAACTCTTTTTGTCTTTTTCTTTTTTTCTCTTCTGTGTCGGTTCACTTAT 

GGAAAACAGCTTTGTTTCTTTGTACATTCCGTAGACTTTCTTGACTTGGAGCAATTCTGT 

TAACTTTAAAATATTCTCGAGTTATTGTAGTAGCAGACTAGCAGCAGTAATGGTTTTCAT 

GAGTCCGATTGAAATTCAGAGATTGAACAGGAAAAAA 

>G1145 Amino Acid Sequence (conserved domain in AA coordinates: 227-270)

MDSQRGIVEQAKSQSLNRQSSLYSLTLDEVQNHLGSSGKALGSMNLDELLKSVCSVEANQ

PSSMAWGGAAAQEGLSRQGSLTLPRDLSKKTVDEW 

EMTLEDLLLKAGVVTETIPGSNHDGPVGGGSAGSGAGLGQNITQVGPWIQYHQLPSMPQP
QAFMPYPVSDMQAMVSQSSLMGGLSDTQTPGRKRVASGEVVEKTVERRQKRMIKNRESAA
RSRARKQAYTHELEIKVSRLEEENERLRKQKEVEKSSQVYHRLIPSGSSDGQARLLSDL*
>G1229 (123..1217)

TTTGGGCGGGTCTTTCTTTCCCTAAATCTTTCTTTTATTTTGCTGTTTAAAAAAAAAATC 

CAACCATAAGACAAAACAACGAACGAGGAAGAGAGAGAGAGAAGGATATATCTCTAATCA 

CGATGCAGGAGATAATACCGGATTTTCTTGAAGAGTGTGAATTTGTCGACACTTCACTAG

CCGGAGATGATCTATTTGCCATCTTAGAGAGTCTTGAAGGTGCCGGAGAGATATCTCCGA 

CAGCTGC^TCTACACCTAAAGATGGAACCACAAGTTCCAAGGAGTTAGTTAAGGA 

ATTATGAAAACTGATCTCCTAAGAGGAAAAAGCAAAGAC 

ACGAAGAAGAAGAAGACGGAGACGGAGAAGCAGAAGAAGATAATAAGCAAGATGGGCAAC 
AAAAGATGTCTCATGTAACCGTGGAACGTAACCGGAGAAAGCAAATGAACGAGCACTTA^ 
CCGTTTTGCGTTCTCTTATGCCTTGTTTCTACGTCAAACGGGGGGACCAAGCATCGATCA 
TAGGAGGAGTTGTGGAGTACATAAGCGAGTTACAACAAGTTCTCCAATCTTTGGAAGCCA 
AGAAACAACGTAAAACCTACGCCGAAGTCCTAAGCCCGAGAGTTGTCCCGAGCCCTCGTC
CTTCACCGCCTGTTCTAAGCCCAAGAAAACCGCCTCTTAGCCCGCGCATCAACCACCACC 
AGATTCACCACCACCTACTTCTCCCTCCCATAAGTCCTCGAACACCTCAGCCAACAAGCC 
C^TACCGGGCCATTC(^CCGC^CTACCACTCATCCCA(^GCCTCCGCTTCGCTCTTAC^ 
GCTCATTGGCCAGTTGCAGCAGCTTAGGAGATCCACCTCCATACTCTCCTGCTTCATCTT 
CTTCATCTCCTTCAGTTAGTAGTAACCATGAGAGTAGTGTGATCAATGAGCTTGTTGCTA 
ACTCAAAATCGGCTTTGGCTGATGTGGAAGTGAAGTTTTCAGGAGCTAACGTGCTGCTCA 
AAACGGTGTCGCATAAGATCCCGGGACAAGTTATGAAGATAATTGCTGCTCTTGAAGATT 
TGGCTCTTGAGATTCTTCAGGTTAATATTAACACCGTCGACGAAACCATGCTTAATTC^ 
TCACCATGAAGATTGGAATTGAGTGCCAACTAAGTGCAGAAGAACTGGCTCAACAAATTC 
AGCAAACATTCTGCTAGTAAAGAAGGATTTAATATAGCTTCGTATAAACCTTAACGAGAG 
AGCAGTACGTACTCACTTTCTCTCCTTAGTATCCCTTTAATTATCTTTTCAGTTTTCTGC 
AAAGATATGGAGTTTAAAAAAATAAAATTGTTATCTAAAGTTTTAATCAAATATTGATTA 
ATTATAACTAATATAGGTATAAGTGAGTTTTAAAGATTATCAGCTTCATAACAGCCATCG 
TCATGTTTACTTTCTTTTAAATrTTAGAATTTAGACGTACTCCTACCATGTAATTTTATT 
TCTGTCATTACATCAAGCATTGTAGCTGTAATTGCATATGAATGAACAATAGTGTATGAG 
TGATCTCATGAATAATATTCTTCTTGCAACACAAAAAAAAAAAA 

>G1229 Amino Acid Sequence (domain in AA coordinates: 102-160) 
MQEIIPDFLEECTFVDTSLAGDDLFAILESLEGAGEISPTAASTPKDGTTSSKELVKDQD
YENSSPKRKKQRLETRKEEDEEEEDGDGEAEEDNKQD^
VLRSLMPCFYVKRGDQASIIGGVVEYISELQQVLQSLEAKKQRKTYAEVLSPRVVPSPRP
SPPVLSPRKPPLSPRIWHHQIHHHLLLPPISPRTPQPTSPYRAIPPQLPLIPQPPLRSYS
SLASCSSLGDPPPYSPASSSSSPSVSSNHESSVINELVANSKSALADVEVKFSGANVLLK
TVSHKIPGQVMKIIAALEDLALEILQVNINTW

QTFC* 

>G1246 (1..1746) 

ATGATCATGTACGGAGGAGGAGGAGCAGGGAAGGACGGTGGATCCACCAATCACTTATCA 

GACGGAGGAGTGATATTGAAGA^AGGTCCATGGACGGCGGCGGAAGATGAGATACTTGCT 

GCGTACGTTAGAGAGAACGGTGAAGGGAATTGGAACGCCGTTCAGAAAAACACAGGTTTG 

GCTCGTTGCGGCAAAAGCTGCCGTCTTCGATGGGCCAATCACCTCCGACCAAATCTGAAA 

AAAGGCTCTTTCACCGGTGACGAAGAACGTCTCATCATTCAGCrTCZATGCTCA^ 

AACAAATGGGCTCGCATGGCTGCTCAGTTACCGGGAAGAACAGACAACGAGATTAAGAAC 

TATTGGAA(^CGAGATTGA7VACGACTTCTTCGCCAAGGACTTCCTCTTTATCCTCCAGAT 

ATTATCCCTAACCATCAACTCCATCCACATCC^ 

CATCATCATCATCATCATCAACAACA^ 

TCTTCACAACGAAACACACCATCATCTTCC 

AAGTCCTCATGATCCTTCACTTTTCATACCACGACTGCTAACCTCCTCCATCCACTTAGC 
CCTC^C^CTCC^AACAC^CCATCTC^^ 

TCTCCTTTATGTTCCCCTCGCAAC^CCAATACCCGACCCTTCCCCTCTTTGCCCTCCCG 
CGTTCCCAAATCAACAACAACAACAACGGAAATTTCACTTTCCCTAGACCTCC^ 
CTTCAACCGCCTTCATCACTCTTCGCAAAACGTTACAACAATGCTAACACTCCTCTTAAT 
TGCATCAACCGCGTCTC^CCGCACCATTTTCCCCTGTTTCAAGAGACTCCTACACTTCC 
TTTCTTAC^TTGCCTTACCCTTCCCCAACCGCTCAAACCGCTACTTACCACAATACTAAT
AACCCTTACTCTTCCTCTCCTTCCTTCTCTTTAAACCCTTCTTCTTCTTCTTACCCTACA 
TGAACTTCTTCCCCAAGCTTTCTTCACTCCCATTACACTCCTTCTTCCACCTCATTTCAT 
ACCAACCCAGTrTACTCCATGAAACAAGAGCAGCTCCCTTCAAACCAAATTCCCCAAATA 
GATGGCTTCAATAACGTCT^CAACTTCACAGACAACGAGAGACAGAATCATAACCTTAAC 
AGTTCCGGTGCTCA5AGAAGAAGTAGTAGCTGCAGCCTCTTAGAGGATGTCTTCGAAGAG 
GCCGAAGCTTTAGCCTCTGGAGGCAGAGGCCGACCTCCAAAACGAAGACAACTCACAGCT 
TCTCTTCCGAACC^CAACAACAA(^C(3^^^ 

GGACATTATGATTCTTOTGACAACTTATGTTCCTTGCAAGATTTGAAATCAAAGGAAGAA 

GAGTCTCTTCAAATGAACACAATGCAGGAGGACATAGCTAAGCTTCTTGATTGGGGAAGT 

GATAGTGGAGAGATCTCTAATGGACAATCATCTGTTGTCACTGACGACAATCTTGTTCT 

GATGTTCATCAATTAGCTTCACTATTCCCGGCTGATTCTACAGCCGTCGTAGCCGCAACA 

AACGACCAACACAACAAGAATAATAACAATAATTGTTCCTGGGATGACATGCAGGGAATA 

AGGTAG 

>G1246 Amino Acid Sequence (domain in AA coordinates: 27-139) 
MIMYGGGGAGKDGGSTNHLSDGGVILKKGPWTAAEDEILAAYVRENGEGNWNAVQKNTGL
ARCGKSCRLRWANHLRPNL

YWWTRLKRLLRQGLPLYPPDIIPNHQLHPHPHHQQQQQHNHHHHHHQQQQQHQQMYFQPQ
SSQRNTPSSSPLPSPTPANAKSSSSFTFHTTTANLLHPLSPHTPNTPSQLSSTPPPPPLS
SPLCSPRNNQYPTLPLFALPRSQINNNITOGN^ 

CINRVSTAPFSPVSRDSYTSFLTLPYPSPTAQTATYHNTNNPYSSSPSFSLNPSSSSYPT
STSSPSFLHSHYTPSSTSFHTNPVYSMKQEQLPSNQIPQIDGFNNVNNFTDNERQNHNLN
SSGAHRRSSSCSLLEDVFEEAEALASGGRGRPPKRRQLTASLPN^

GHYTDSSDNLCSLQDLKSKEEESLQMNTMQEDIAKLIiD^ 

DVHQLASIiFPADSTAWAATOTQHNK3^mNNCSWD 

>G1255 (138..1388)

CAGCTC^UU^CTCTCTAGGACTACACTAAATC 

TAATTGAGATTGATCTGAAAACCAAAGCTCTCGTGCTCTTGTCGTTGATGTTGGTTGTGT 

AGACTTTGTATACAATGATGAAAAGTTTGGCGAATGCTGTTGGAGCGAAGACGGCGAGGG 

CTTGCGACAGCTGCGTGAAGAGACGTGCACGGTGGTACTGCGCGGCCGACGATGCTTTTC 

TTTGCC^GTCTTGCGACAGTTTGGTCC^TTCAGCAAACCCTCTTGCTCGCCG 

GAGTCCGTTTGAAGACGGCTAGCCCGGCGGTCGTAAAGCATAGCAACGACTCATCAGCTT 

CTCCTC(^C^TGAGGTCGCC^CGTGGCATCACGGGTTTACTCGTAAAGCTCGAACGCCAC 

GTGGCTCTGGTAAGAAAAACAATTCGTCGATATTTCATGACTTGGTTCCTGATATTAGTA 

TTGAGGATCAGACAGACAACTATGAGCTTGAAGAGCAGCTGATCTGTCAAGTGCCGGTTC 

TAGATCCGTTGGTGTCTGAGCAGTTCTTGAACGATGTCGTTGAGCCCAAGATCGAGTTTC 

CTATGATCAGAAGTGGTTTGATGATCGAGGAGGAGGAAGACAACGCTGAAAGTTGTCTTA 

ATGGATTTTTCCCGACCGACATGGAGCTTGAGGAGTTTGCTGCTGACGTGGAGACTCTGC 

TCGGTCGCGGGTTAGACACGGAGTCGTATGCCATGGAGGAGCTAGGGTTATCTAATTCAG 

AGATGTTCAAAATCGAAAAAGATGAGATTGAAGAAGAAGTAGAAGAGATAAAAGCCATGA 

GCATGGATATATTTGATGATGATCGAAAAGACGTGGATGGAACAGTACCGTTTGAGCTAA 

GCTTTGATTACGAGTCGTCACACAAGACGTCCGAAGAAGAGGTAATGAAGAACGTTGAAA 

GTAGTGGTGAATGTGTTGTTAAGGTGAAAGAGGAAGAACATAAGAATGTTCTGATGCTAA 

GATTAAACTATGACTCGGTGATATCCACTTGGGGAGGTCAAGGTCCACCGTGGAGTTCAG 

GAGAGCCACCGGAACGAGACATGGAC^TC^GCGGTTGGCCAGCCTTTTCCATGGTGGAGA 

ATGGAGGAGAAAGTACTCATCAGAAGCAATACGTTGGTGGATGTTTACCATCAAGTGGGT 

TTGGAGATGGAGGTAGAGAAGCTAGAGTTTCGAGATACAGAGAGAAGAGGAGGACAAGGT 

TGTTTTCTAAGAAGATACGGTACGAGGTACGTAAATTGAATGCAGAGAAAAGACCACGAA 

TGAAAGGAAGATTCGTGAAGAGAGCCTCGCTCGCTGCTGCTGCTTCACCATTAGGTGTTA 

ATTACTGAATAGTTAATATCTATTCATGTTATATCTCACl^TACAAATTTCGGTGAATCT 

TTTTTCTTCTGAAACAACAGAAGTTATTTTGGCACTTAATTGTGCTTTGAGGACTTGTAT 

GTACATAGAAGTAACCAATAATAATGTGACTTTTACTA 

>G1255 Amino Acid Sequence (domain in AA coordinates: 18-56)
MKSLANAVGAKTARACDSCVKRRARWYCAADDAFLCQSCDSLVHSANPLARRHERVRLKT
ASPAVVKHSNHSSASPPHEVATWHHGFTRKARTPRGSGKKNNSSIFHDLVPDISIEDQTD
NYELEEQLICQVPVLDPLVSEQFLNDVVEPKIEFPMIRSGLMIEEEEDNAESCLNGFFPT
DMELEEFAADVETLLGRGLDTESYAMEELGLSNSEMFKIEKDEIEEEVEEIKAMSMDIFD
DDRKDVDGTVPFELSFDYESSHKTSEEEVMKNVESSGECVVKVKEEEHKNVLMLRLNYDS
VISTWGGQGPPWSSGEPPERDMDISGWPAFSMVENGGESTHQKQYVGGCLPSSGFGDGGR
EARVSRYREKRRTRLFSKKIRYEVRKLNAEKRPRMKGRFVKRASLAAAASPLGVNY*
>G1304 (1..978) 

ATGGGGCGATCACCATGTTGCGATGAGAATGGTCTAAAGAAAGGGCCATGGACACAAGAG 

GAGGATGATAAACT^ATAGATCACATTCAAAAACATGGCCATGGCAGCTGGAGAGCTCTT 

CCAAAGCAAGCCGGTTTAAACCGATGCGGAAAGAGTTGTAGATTAAGATGGACCAACTAC 

TTGAGACCTGACATCAAGAGAGGAAATTTCACTGAAGAGGAAGAACAAACTATTATCAAC 

CTCCATTCCCTTCTTGGAAACAAGTGGTCGTCGATAGCCGGTAATCTTCCTGGAAGAACG 

GACAATGAAATAAAAAACTATTGGAACACACATTTGAGAAAGAAACTTCTCGAAATGGG^ 

ATTGATCCGGTGACCCATAGGCCAAGAACCGACGATCTAAACGTTTTAGCAGCTCTCCCG 

CAGCTTATAGCCGCCGCAAATTTCAACAGCCTCT^ 

GATGCAACAACTCTTGCTAAAGCTCAACTGCTAC&(^ 

AATAACAACACCACCAATCCTTCTTTTTCOT 

CTCTTTGGCCAAGCTTCTTACTTAGAGAACCAAAATCTTT 
TCTCACATTCTTGAGGATGAGAAT

GACTCTTTTTCTTCCCCCATACAACCCGGTTTTCAAGATGATCATAATTCACTCCCTCTA 
TTGGTTCCGGCGTCTCCTGAAGAATCTAAAGAAACTCAAAGGATGATCAAGAACAAAGAC 
ATCGTCGATTACCATCATCATGATGC^TCAAACCCTTCATCATCAAACTCAACGTTTACA 
CAAGATCATCATCACCCATGGTGTGACACTATTGATGATGGAGCAAGTGATTCTTTTTGG 

AAAGAGATAATAGAGTAA 

>G1304 Amino Acid Sequence (conserved domain in AA coordinates: 13-118)
MGRSPCCDENGLKKGPWTQEEDDKLIDHIQKHGHGSWRALPKQAGLNRCGKSCRLRWTNY
LRPDIKRGNFTEEEEQTIINLHSLLGNKWSSIAGNLPGRTDNEIKNYWNTHLRKKLLEMG
IDPVTHRPRTDHLISnnjAAk^ 

NNNTTNTPSFSSSTMQNSNTNIjFGQASYIiENQI^FGQS 

DSFSSPIQPGFQDDHNSLPLLVPASPEESKETQRMIKNKDIVDYHHHDASNPSSSNSTFT
QDHHHPWCDTIDDGASDSFWKEIIE*
>G1318 (7..849)

AAAAATATGAGGAAGCCAGAGGTAGCCATTGCAGCTAGTACTCACCAAGTAAAGAAGATG 

AAGAAGGGACTTTGGTCTCCTGAGGAAGACTCAAAGCTGATGCAATACATGTTAAGCAAT 

GGACAAGGATGTTGGAGTGATGTTGCGAAAAACGCAGGACTTCAAAGATGTGGCAAAA 

TGCCGTCTTCGTTGGATCAACTATCTTCGTCCTGACCTCAAGCGTGGCGCTTTCTCTCCT 

C^GAAGAGGATCTC^TCATTCGCTTTC^TTC<^TCCTCGGCAACAGGTGGTCTCAGACT 

GCAGCACGATTGCCTGGTCGGACCGATAACGAGATCAAGAATTTCTGGAACTCAACAATA 

AAGAAAAGGCTAAAGAAGATGTCCGATACCTCCAACTTAATCAACAACTCATCCTCATCA 

CCCAACACAGCAAGCGATTCCTCTTCTAATTCCGCATCTTCTTTGGATATTAAAGACAT 

ATAGGAAGCTTCATGTCCTTACAAGAACAAGGCTTCGTCAACCCTTCCT 

CAAACCAACAATCCATTTCCAACGGGAAACATGATCAGCCACCCGTGCAAT 

ACCCCTTATGTAGATGGTATCTATGGAGTAAACGCAGGGGTACAAGGGGAACTCTACTTC 

CCACCTTTGGAATGTGAAGAAGGTGATTGGTACAATGCAAATATAAACAACCACTTAGAC 

GAGTTGAACACTAATGGATCCGGAAACGCACCTGAGGGTATGAGACCAGTGGAAGAATTT 

TGGGACCTTGACCAGTTGATGAACACTGAGGTTCCTTCGTTTTACTTCAACTTCAAACAA 

AGCATATGAATATTTTTACGTCATCTTATTCTTTTTTCTATTGCGGTTTATACTCAAG 

TCTTAGCCACACACACATAAATGCAAATATATATACATTGTTAGAGAGTATTTTGTATTT 

CGTATAATCTTTTCGTACTAGGGCTTGAGCCTTGAGGTCCCATGTAACGATTAGTCAATG 

TAAAACATATATCCTATAATAAATAAATAAAAGAAATAATAAGCACATAAAAAAAAAAAA 

A 

>G1318 Amino Acid Sequence (domain in AA coordinates: 20-123) 
MRKPEVAIAASTHQVKKMKKGLWS PEEDSKLMQYMLSNGQGCWSDVAKNAGIjQRCGKS CR 
LRWINYJjRPDIiKRGAFS PQEEDLI IRFHS ILGNRWSQIAARJjPGRTDNEIKNFWNSTIKK 
RLKKMSDTSNLIl^SSSSPNTASDSSSNSASSL^ 
l^PFPTGNMISHPCNDDFTPYVDGIYGTOAGV^ 
NTNGSGNAPEGMRPVEEFWDLDQLMNTEVPSFYFNFKQS I *
>G1320 (39..788)

GAAGATCATAAAGATCAAAAGGAGAGAGGTATTAAAAAATGATGTGTAGTCGAGGCCATT 
GGAGACCTGCAGAAGACGAGAAGCTAAGAGAACTCGTCGAGCAATTTGGTCCTCATAATT 
GGAACGCCATAGCTCAGAAGCTCTCTGGTCGATCTGGTAAGAGTTGTAGATTGAGATGGT 
TTAATCAATTGGATCCTAGGATTAACCGAAACCCTTTCACGGAGGAAGAAGAAGAAAGGC 
TTTTAGCGCCTCATCGGATCCATGGGAACAGATGGTCTGTGATCGCTAGATTTTTTCCCG 
GTCGAACTGATAACGCTGTTAAAAACCATTGGCACGTCATCATGGCTCGTCGTGGCCGAG 
AACGGTCCAAGCTCCGTCCACGAGGCCTTGGCCATGATGGCACGGTGGCTGCGACTGGGA 
TGATTGGTAATTAT-AAAGACTGCGATAAGGAGAGAAGATTGGCAACCACAACCGCTATCA 
ATTTTCCTTATCAATTCTCTCATATTAATCATTTTCAAGTCCTCAAAGAGTCCTTGACCG 
GAAAGATCGGGTTCAGAAATAGTACTACTCCAATACAAGAAGGAGCAATAGACCAAACTA 
AACGACCGATGGAGTTCTACAATTTTCTCCAAGTAAACACGGATTCGAAGATACACGAAT 
TGATAGATAATTCAAGAAAAGACGAAGAAGAAGATGTCGATCAAAACAACCGAATTCGTA 
ACGAGAATTGTGTTCCATTTTTCGACTTTTTGTCTGTTGGAAACTCTGCCTCTCAGGGTT 
TATGTTAATTTGTCCGTACCACATGTACTATAAGGTGGACCATATGTTTU^CTAAAGATAA 
TGTAGAAAGTACTAATC^VATTAGAGCTCCTGTTTGAGCCAAATGTGAAAATTAGTTAAGA 
CATCCCAAACATTTTCTTGTATAACACA 

CTATTTTTATTTTAAGGATGTTTAATCAGACCCATAACCATTCGATAAAAAAAAAAAAAA

>G1320 Amino Acid Sequence (domain in AA coordinates: 5-108)

MMCSRGHWRPAEDEKLRELVEQFGPHNWNAIAQKLSGRSGKSCRLRW 

TEEEEERLLAPHRIHGNRWSVIARFFPGRTDNAVKNHV^ 

GTVAATGMIGNYKDCDKERRIiATTTAINFPYQFSHIlfflFQVIiKESLTGKIGFRNSTTPIQ 

EGAIDQTKRPMEFYNFLQVNTDSKIHELIDNSRKDEEED^ 

GNSASQGLC*

>G1330 (36..959)

GTACCGGCGACCTCTTTGTGGGTCACTCTTCATCAATGGGTGACAAAGGAA.GGAGCTTAA 
AGATCAACAAGAACATGGAGGAATTCACGAAAGTGGAAGAAGAAATGGACGTAAGGAGAG 
GTCCATGGACAGTTGAGGAAGATTTAGAGCTCATCAATTACATTGCTAGTCATGGTGAAG 
GTCGATGGAACTCTCTCGCTCGTTGCGCCGAACTCAAAAGGACCGGAAAAAGCTGCAGAC 
TTCGGTGGCTGAACTATCTCCGACCAGATGTGCGCCGTGGAAACATAACCCTCGAAGAAC 
AACTCTTGATTCTTGAACTTCACACACGTTGGGGCAATAGATGGTCTAAGATTGCACAAT 
ATTTACCAGGAAGAACGGATAACGAGATCAAAAACTATTGGAGAACACGTGTTCAAAAGC 
ATGGAAAACAGCTTAAATGCGACGTGAACAGTCAACAATTTAAAGACACCATGAAGTATC 
TTTGGATGCCTCGGCTCGTAGAAAGGATCCAAGCCGCGTCCATCGGGTCTGTTTCCATGT 
CATCTTGCGTCACCACCTCCTCAGATCAGT^ 

TGGATAATTTGGCTTTAATGAGTAACCCTAATGGTTACATCACGCCGGATAATTCCAGCG 

TGGCAGTATCTCCTGTATCAGATTTGACGGAGTGTCAAGTGAGTAGTGAAGTGTGGAAGA 

TTGGTCAGGATGAGAATTTGGTGGATCC7VAAAATGACATCGCCGAATTATATGGATAATA 

GCAGTGGACTATTAAACGGAGATTTTACGAAGATGCAAGATCAAAGTGACCTTAATTGGT 

TTGAAAATATTAATGGGATGGTACCAAATTATTCGGACAGTTTTTGGAACATTGGAAATG 

ATGAAGACTTCTGGCTCTTACAACAACATCAA 

TAGACAAGAAGCTATGCGGCC 

>G1330 Amino Acid Sequence (domain in AA coordinates: 28-134) 
MGDKGRSLKINK^EEFTKVEEEIVnDA^^ 

KRTGKSCRLRWLNYLRPDVRRGNITLEEQIjLIIjELHTR I KN 

YWRTRVQKHAKQLKCDWSQQFKD 

rNl^l^TNlWDlJmALMSNPNGYITPDNSSVAVSPVSD 

TS PNYMDNSSGjLIiNGDFTKMQDQSDIjNWFENINGMVPNYSDS FWNI GNDEDFWLIjQQHQQ 

VHDNGSF*

>G1352 (79..900)

GCGCGATTAAAAACTCTCAACTTTTCTCTCAAATTTCTGATCCTTTGATCCAACAGTTAG 

AAGAAGATTCATCTGATGATGGCCCTCGAAGCGATGAACACTCCAACTTCTTCTTTCACC 

AGAATCGAAACGAAAGAAGATTTGATGAACGACGCCGTTTTCATTGAGCCGTGGCTTAAA 

CGCAAACGCTCGAAACGTCAGCGTTCTCACAGCCCTTCTTCGTCTTCTTCCTCACCGCCT 

CGATCTCGACCCAAATCCCAGAATCAAGATCTTACGGAAGAAGAGTATCTCGCTCTTTGT 

CTCCTCATGCTCGCTAAAGATCAACCGTCGCAAACGCGATTTCATCAAGAGT 

TTAACGCCGCCGCCAGAATCAAAGAACCTTCCGTACAAGTGTAACGTCTGTGAAAAAGCG 

TTTCCTTCCTATCAGGCTTTAGGCGGTCAGAAAGCAAGTCACCGAATCAAACCACCAA 

GTAATCTCAACAACCGCCGATGATTCAACAGCTCCGACCATCTCCATCGTCGCCGGAGAA 

AAACATCCGATTGCTGCCTCCGGAAAGATCCACGAGTGTTCAATCTGTCATAAAGTGTTT 

CCGACGGGTCAAGCTTTAGGCGGTCACAAACGTTGTCACTACGAAGGCAACCTCGGCGGC 

GGAGGAGGAGGAGGAAGCAAATCAATCAGTCACAGTGGAAGCGTGTCGAGCACGGTATCG 

GAAGAAAGGAGCCACCGTGGATTCATCGATCTAAACCTACCGGCGTTACCTGAACTCAGC 

CTTCATCACAATCCAATCGTCGACGAAGAGATCTTGAGTCCGTTGACCGGTAAAAAACCG 

CTTTTGTTGACCGATCACGACCAAGTCATGAAGAAAGAAGATTTATCTTTAAAAATCTAA 

TACTCGACTATTAACTCTTGTGTGATTTTT^ 

TTTTAGTTAC AAATTTTT AATTGTTC TGATTTGGATTGAAA 

>G1352 Amino Acid Sequence (domain in AA coordinates: 108-129,167-188) 

MALEAMNTPTSSFTRIETKEDLI^^ 

QNQDLTEEEYLAIjCLLMLAKDQPSQTRFHQQSQ 

LGGHKASHRIKPPTVISTTADDSTAPTISIVAGEKHPIAASGKIHECSICHKVFPTGQAL 
GGHKKCHYEGNIiGGGGGGGSKSISHSGSVSSTVSEERSHRGFIDLNLPAIiPELSLHHNPI 
VDEEHjSPLTGKKPLLLTDHDQVIKKEDLSLKI * 
>G1354 (1..1047) 

ATGGAAAGTCTCGCACACATTCCTCCCGGTTATCGATTCCATCCGACCGATGAAGAACTC
GTTGACTATTATCTCAAGAACAAAGTTGCATTCCCGGGAATGCAAGTTGATGTTATCAAA
GATGTTGATCTCTACAAAATCGAGCCA1K3GGACATCCAAGAGTTATGTGGAAGAGGGACA 
GGAGAAGAGAGG GAATGGTATTTCTTTAGC CACAAGGACAAGAAATATCCAACTGGG ACA 
CGAACCAATAGAGC^CGGGCTCCGGATTTTGGAAAGCAACGGGTCGAGACAAGGCCATT 
TACTCAAAGCAAGAGCTTGTTGGGATGAGGAAGACTCTTGTCTTTTACAAAGGTAGGGCC 
CCAAATGGTCAGAAATCTGATTGGATAATGCACGAATACCGTCTTGAGACCGATGAAAAT 
GGACCGCCTCATGAGGAAGGATGGGTGGTTTGTCGCGCTTTCAAGAAGAAGCTAACCACG 
ATGAACTACAACAATCCAAGAACAATGATGGGATCATCATCAGGCCAAGAATCTAACTGG 
TTCACG<^GCAAATGGATGTGGGGAATGGTAATTACTATCATCTTCCTGATCTAGAGAGT 
CCGAGAATGTTTCAAGGCTCATCATCATCATCACTAT^^ 

GACCCTTATGGTGTCGTACTCAGCACTATTAACGGAACCCCAACTAC?U^TAATGCAACGA 
GATGATGGTCATGTGATTACCAATGATGATGATCATATGATCATGATGAACACAAGTACT 
GGTGATCATCATCAATCAGGATTACTAGTCAAT 

TGGCAAACG CTTG ACAAGTTTGTTGCTTCTCAGCTAATCATGAGC CAAGAAGAGGAAGAA 
GTTAACAAAGATCC^TC^GATAATTCTTCGAATGAAACATTTCATCATCTCTCTGAAGAG 
CAAGCTGCAACAATGGTTTCGATGAATGCTTCTTCCTCTTCTTCTCCATGTTCCTTCTAC 
TCTTGGGCTCAAAATACACACACGTAA 

>G1354 Amino Acid Sequence (domain in AA coordinates: TBD) 
MESLAHIPPGYRFHPTDEELVDYYliKl^ 

GEEREWYFF SHKDKKYPTGTRTNRATGSGFWKATGRDKAI YS KQEIiVGMRKTLVFYKGRA 
PNGQKSDWIMHEYRLETDENGPPHEEGWVVCRAFKKKLTTMNYlTbTPRTM SGQESNW 
FTQQMDVGNGNYYHIjPDLESPRMFQGSSSSSLSSLHQNDQDPYGVVLSTINATPTTIMQR 
DDGHVITl^DDHMIMMNTSTGDHH^ 

VNKDPSDNSSNETFHHLSEEQAATMVSMNASSSSSPCSFYSWAQNTHT* 
>G1360 (1..1257) 

ATGGGAGATAGAAACAACGACGGTGATCAGAAAATGGAGGATGTATTGTTGCCCGGATTT 
AGGTTTCATCCAACCGACGAAGAGCTCGTAAGCnTCTACCTGAAGCGGAAGGTTCAACAC 
AACCCTCTCTCCATTGAGCTCATAAGACAACTCGA^^ 

CTTCCAAAGTTTGCGATGACGGGTGAAAAAGAATGGTACTTTTATTGTCCAAGGGA 

AAGTATAGGAACAGC^CGAGGCCAAACCGAGTGACCGGAGCTGGTTTTTGGAAAGCCACG 

GG AACGGACCGGCCGATATACTCGTCAGAAGGAAACAAATGCATAGGTTTAAAGAAGTCC 

TTAGTGTTCTACAAAGGAAGAGCAGCGAAAGGAGTTAAGACTGATTGGATGATGCATGAG 

TTTCGTTTGCCTTCTCTCTCCGAACCATCT 

GTCTCTCCCAACGATTCATGGGCTATATGC^^ 

ATGTCTAACCAAAAGCAATGAAACACATACCATTTTTCTTCAGA 

AGCTCTCACTTCCAGTTTCACCTVTGAGAATATGAACACTCCCAAAACTAGTAATAGTACA 
ACTCCATCCGTTCCCACTATAAGTCCCTTCT^ 
CCGACCAACGTTTTCAATCCGGTTTCATGTT^ 
CTTGCCACACAAGAAACACAACCTCAGTTTCCCAGGCTCCC 

TCGTTTCTGCTAAACACGTCTTCAGATTCGACCTTCTTGGGAGAATTCACGAGCCATATC 
GACCTCAGCGCAGTGTTGGCCCAAGAGCAATGTCCGCCGCTTGTAAGCCTACCACAGGAG 
TATGAAGAGACGGGATTCGAAGGAAATGGTATAATGAAGAACATGCGTGGTTCCAATGAA 
GATCATCTTGGTGATCATTGCGACACACTTCGGTTTGATGATTTCACTTCAACAATTAAT 
GAGAACCATCGTCATCATCAAGACCTGAAACAGAACATGACATTGCTGGAGAGTTATTAT 
TCTTCTTTTATCGTCCATCAATAGCGA

>G1360 Amino Acid Sequence (conserved domain in AA coordinates: 18-174)
MGDRNlTOGDQKMEnVLLPGFRFHPTDEELVSF YLKRKV S IEL IRQLD I YKYDPWD 

LPKFAMTGEKEWYFYCPRDRKYRNSSRPNRVTGAGFWKATGTDRP I YS SEGNKC IGIiKKS 
LVPYKGRAAKGVKTDWMMHEFRLPSLSEPSPPSKRFFDSPVSPITOSWAICRIFKKTNTT^ 
LRALSHSFVSSLPPETSTDTMSNQKQSNTYHFSSDKILKPSSHFQFHHENMNTPKTSNST 
TPSvTTISPFSYLDFTSYDKPTNVFNPVSCLDQQYLTNLFIiATQETQPQFPRIiPSSNEIP 
SFLIiNTSSDSTFLGEFTSHIDLSAVLAQEQCPPIiVSIiPQEYQETGFEGNGIMKNMRGSNE 
DHLGDHCDTLRFDDFTSTINENHRHHQDLKQNMTIjLE S YYSSL S S INSDLPACFS STT * 
>G1364 (1..537) 

ATGGCGGAGTCGCAGGCCAAGAGTCCCGGAGGCTGTGGAAGCCATGAGAGTGGTGGAGAT 
CAAAGTCCCAGGTCGTTACATGTTCGTGAGCAAGATAGGTTTCTTCCGATTGCTAACATA
AGCCGTATCATGAAAAGAGGTCTTCCTGCTAATGGGAAAATCGCTAAAGATGCTAAGGAG

ATTGTGCAGGAATGTGTCTCTGAATTCATCAGTTTCGTCACCAGCGAAGCGAGTGATAAA 

TGTCAAAGAGAGAAAAGGAAGACTATTAATGGAGATGATTTGCTTTGGGCAATGGCTACT 

TTAGGATTTGAAGACTACATGGAACCTCTCAAGGTTTACCTGATGAGATATAGAGAGGGT 

GACACAAAGGGATCAG CAAAAGGTGGGGATC CAAATGCAAAGAAAGATGGGCAATCAAGC 

CAAAATGGCCAGTTCTCGCAGCTTGCTCACCAAGGTCCTTATGGGAACTCTCAAGTAACT 

TTTCCTCTCTTCTCTTCA(^CTCAAGCAATACGCATCATTCTCTTCTAATTTGTT^

>G1364 Amino Acid Sequence (conserved domain in AA coordinates: 29-120)

MAESQAKS PGGCGSHE SGGDQS PRS LHVREQDRFLPIANI SRIMKRGLPANGKI AKDAKE 

IVQEWSEFISFVTSEASDKCQREKRKTINGDDL^ 

DTKGSAKGGDPNAKKDGQSSQNGQFSQLAHQGPYGNSQWFPLFSSHSSNTHHSLLIC*
>G1379 (68..622)

CTCTGCCTCTCTCTCTCTCTCAAAACCCATCTCGAAAGTCTTTCTCTTTCGAGGGTTTAG 

ATCCTCCATGGAAGGCGGCGGAGTTGCTGACGTGGCTGTCCCCGGTACGAGGAAGAGAGA 

CAGACCTTACAAAGGAATTAGGATGAGGAAGTGGGGAAAGTGGGTGGCGGAGATTCGTGA 

GCCTAACAAGCGCTCTAGGTTATGGCTTGGCTCTTACTCTACTCCCGAGGCGGCGGCGCG 

AGCTTACGACACGGCGGTTTTCTATCTTAGAGGACCTACGGCGAGGCTTAACTTCCCTGA 

GCTTCTTCCTGGGGAGAAATTCTCCGACGAGGATATGTCX3GCTGCGACCATCAGGAAGAA 

AGCCACGGAGGTCGGTGCTCAGGTTGATGCTTTGGGCACGGCGGTGCAAAATAACCGCCA 

CCGTGTTTTTGGTCAGAATCGAGATAGTGATGTGGATAATAAGAATTTTCATCGGAATTA 

TC7VAAACGGTGAACGAGAAGAAGAAGAAGAAGATGAGGATGACAAGAGATTGAGGAGTGG 

CGGCCGGTTATTGGATCGGGTTGACTTGAATAAATTACCCGACCCGGAAAGCTCCGATGA 

AGAATGGGAAAGCAAAGATTAAAAATATATAGTTTGGAGCGGTGGCTGTTGCTAAC GTAC 

GCC^CGGCTTGCTTCrrACGAATCATTAGCGCCGTTTATGATTTTTTTTTTT^ 

CATTATCTGAAAATTTAGGGCTTTTTAGTTATTAATTTTTGTTTTGTTTTTTTCCTT 

TGCGAGTTTTGCGGTTTATGGAATTTTAGGOTATTGCTTAACGAAAAAAAAAAAAA 

>G1379 Amino Acid Sequence (domain in AA coordinates: 18-85) 

MEGGGVADVAv^GTRKIS}RPY 

DTAVEYLRGPTARIiNFPELLPGEKFSDEDMSAAT^ 

FGQNRDSDVDNKNFHRNYQNGEREEEEEDE^ 

ESKH*

>G1384 (33..977)

GTACATTTTTTTTTGTATTTCAGGAAACTCCGATGGCGGATCTCTTCGGTGGTGGCCACG 
GCGGCGAGCTTATGGAAGCACTTCAACCTTTTTACAAAAGTGCTTCCACGTCTGCTTCAA 
ATCCTGCGTTTGCGTCCTCAAACGATGCGTTTGCGTCTGCCCCAAACGACCTATTTTCTT 
CTTCTTCTTACTATAATCCTC^TGCATCTTTATTCCCTTCAC^TTCCACAACCTCTTACC 
CGGATATTTATTCTGGATCCATGACCTATCGATCTTCATTCGGGTCGGATCTTCAACAAC 
CCGAAAACTACCAATCTGAGTTCCATTACCAAAACACTATCACTTACACTCACCAAGACA 
ACAACACTTGCATGCTTAACTTCATTGAGCCGAGCC^CCGGGTTTTATGACCCAACCGG 
GTCCGAGTTCGGGTTCGGTTTCAAAACCGGCTAAGCTCTATAGAGGAGTGAGGCAAAGAC 
ATTGGGGAAAATGGGTCGCGGAGATCCGTTTACCCAGGAACCGAACCCGACTTTGGCTCG 
GAACATTCGACACGGCTGAAGAAGCCGCGTTGGCTTATGATCGCGCCGCGTTTAAGCTTC 
GTGGTGACTCGGCTCGGCTTAACTTCCCAGCTCTCCGATACCAAACCGGCTCGTCTCCGT 
CTGATACCGGCGAATATGGTCCTATTGAAGCTGCCGTAGACGCTAAACTAGAAGCCATAT 
TAGCTGAGCCGAAGAATCAGCCGGGCAAAACGGAGAGGACGTCGAGGAAACGAGCTAAAG 
CCGCGGCTTCTTCAGCTGAGCAGCCGTCAGCGCCACAACAACATTCCGGGTCGGGTGAAA 
GTGATGGGTCGGGTTCACCGACTTCGGATGTTATGGTGCAGGAGATGTGCCAAGAGCCAG 
AGATGCCATGGAATGAAi^TTTC^TGCTCGGCAAGTGTCCTTCTTATGAGATAGATTGGG 
CTTCAATTTTATCGTGAAAAATTAGGATTCAATTCATTTTTATTCATTTTAACTTGTTTG 
TATTTTCTTTTAAACTTTAGGGTTATTAGCTGTGCGTAA 

>G1384 Amino Acid Sequence (domain in AA coordinates: TBD) 
MADLFGGGHGGEIiMEALQPFYKSASTSASNPAFASSNDAFASAPNDLFSSSSYYNPHASL 
FPSHSTTSYPDIYSGSMTYPSSFGSDLMPEl^QSQFHYQNTlTyTHQDNOTCMIiNFIEP 
SQPGFMTQPGPSSGSVSKPAKLYRGVTIQRHWGKWVAE 

AYDRAAFKLRGDSARLNFPALRYQTGSSPSDTGEYGPIQAAVl^AKLEAII^ 
ERTSRKRAKAAASSAEQPSAPQQHSGSGESDGSGSPTSDVMVQEMCQEPEMPWNENFMLG 

KCPSYEIDWASILS*
>G1399 (261..1475)

AGGTCGAATTTTCTGAAATTAAGATTCATTCCTC 

CTTTAGCTTAGCTTAGCTTCTACTGATCTGTTTTTGCTACAAAATCCCATCTITIU'CTTT 

AA7UVCTCTTTATCTCTGAATCTTGAGTTTCTTGTAGAAGAAGAAGCAATTTTGAATCTTT 

CGTAATCATAAAGATTCGTGGAGGATCTCTACTGATTTGTCGGAATCTCTCACTACAGAA 

TCACTTGATCTTATGTCCGGATGGAGGAGAGAGAAGGAACCAACATCAACAACAAC^ 

CTAGCAGTTTCGGCTTGAAGCAGCAACATGAAGCTGCTGCTTCTGATGGTGGTTACTCAA 

TGGACCCACCACCAAGACCCGAAAACCCTAACCCGTTTTTAGTCCCACCCACTACTGTCC 

CCGCGGCCGCCACCGTAGCAGCAGCTGTTACTGAGAATGCGGCTAOTCCGTTTAGCTTAA 

CAATGCCGACGGAGAACACTTCAGCTGAGCAGCTGAAAAAGAAGAGAGGTAGGCCGAGAA 

AGTATAATCCCGATGGGACTCTTGTCGTGACTTTATCGCCGATGCCAATCTCGTCCTCTG 

TTCCGTTGACGTCGGAGTTTCCTCCAAGGAAACGAGGAAGAGGACGTGGCAAGTCTAATC 

GATGGCTCAAGAAGTCTCAAATGTTCCAATTCGATAGAAGTCCTGTTGATACCAATTTGG 

CAGGTGTAGGAACTGCTGATTTTGTTGGTGCCAACTTTACACCTCATGTACTGATCGTCA 

ACGCCGGAGAGGATGTGACGATGAAGATAATGACATTCTCTCAACAAGGATCTCGTGCTA 

TCTGCATCCTTTCAGCTAATGGTCCCATCTCC^^ 

CCGGTGGTACTCTAACTTATGAGGGTCGTTTTGAGATTCTCTCTTTGACGGGTTCGTTTA 
TGCAAAATGACTCTGGAGGAACTCGAAGTAGAGCTGGTGGTATGAGTGTTTGCCTTGCAG 
GACC^GATGGTCGTGTCTTTGGTGGAGGACTCGCTGGTCTCTTTCTTGCTGCTGGTCCTG 
TCCAGGTAATGGTAGGGACTTTTATAGCTGGTC^^ 

AAGAAAGACGGCTAAGATTTGGGGCTC^^CCATCTTCTATCTCCTTTAACATATCCGCAG 

AAGAACGGAAGGCGAGATTCGAGAGGCTTAACAAGTCTGTTGCTATTCCTGCACCAACCA 

CTTCATAC^CGCATGTAAACACAACAAATGCGGTTCACAGTTACTATACAAAC 

ACCATGTCAAGGATCCCTTCTCGTCTATCCCAGTAGGAGGAGGAGGAGGTGGAGAGGTAG 

GAGAAGAAGAGGGTGAAGAAGATGATGATGAATTAGAAGGTGAAGACGAAGAATTCGGAG 

GCGATAGCCAATCTGACAACGAGATTCCGAGCTGATGATGATCATACGGTTTCTTTTCGC 

GGATTTGTTAGGTTTGATGGATTTCAGATTTTGGTTGATTGTTTTTATTAACACAGAATG 

TTTAGAAGCTGCTATCTTTAGGTTCCCATCCTCTTGTGATTGTTGAGTATCCTTGTTAGA 

AACAAACTTACTGTTGCAAAACTCTCTTCAAAAAAGTTTCAC^ 

>G1399 Amino Acid Sequence (domain in AA coordinates: 86-93) 
MEEREGTNINNNITS S FGLKQQHEAAASDGGYSMDPPPRPENPNPFLVPPTTVPAAATVA 
AAVTENAATPFSLTMPTENTSAEQLKKKRGRPRK^ 
PPRKRGRGRGKSNRWLKI<SQMFQFDRSPVDTmxAGVGTADFVGAN 
MKIMTFSQQGSRAICILSANGPISNWLRQSMT^ 

TRSRAGGMSVCLAGPDGRVFGGGIjAGIjFIiAAGPVQVIWGTFIAGQEQSQLEIiAKERRIjRF 
GAQPSSISFNISAEERKARFERIjNKSVAIPAPTTSYTHVNT^ 
SSI PVGGGGGGEVGEEEGEEDDDELEGEDEEFGGDS QSDNE I PS *
>G1415 (60..680)

CCTTATCACTCACCAAAAGTCGTCACATAATATCACTTTCGAGTTATCAACATCCGTAC^ 
TGTCATCCATAGAGCC^UVAAGTAATGAT^ 

AAGCTAGTTCGAGGAAAGGTTGTATGAGAGGAAAAGGTGGACCCGATAACGCGTCTTGCA 

CTTACAAAGGTGTTAGACAACGCACrTGGGGCAAATGGGTCGCTGAGA 

ACCGAGGAGCTCGTCTTTGGCTCGGTACCTTCGACACCTCCCGTGAAGCTGCCTTGGCriT 

ATGACTCCGCAGCTCGTAAGCTCTATGGGCCTGAGGCTCATCTCAACCTCCCTGAGTCCT 

TAAGAAGTTACCCTAAAACGGCGTCGTCTCCGGCGTCCCAGACTACACCAAGCAGCAACA 

CCGGTGGAAAAAGCAGCAGCGACTCTGAGTCGCCGTGTTCATCCAACGAGATGTCATCAT 

GTGGAAGAGTGACAGAGGAGATATCATGGGAGCATATAAACGTGGATTTGCCGGTAATGG 

ATGATTCTTCAATASGGGAAGAAGCTACAATGTCGTTAGGATTTCCATGGGTTCATGAAG 

GAGATAATGATATTTCTCGGXTTGATACTTGTATTTCCGGTGGCTATTCTAATTGGGATT 

CCTTTCATTCCCCACTTTGAGGTGTCACTAGACTCTCTTTAATTGTTAAGTTATGATATA 

CAAACTACATATATATACAAATATAGTCACCGTGAACTAGGATATATATGTAAATAAACA 

CCAGTTACATGTACTTATATATGTGCACATCTATATATGTGGTTTGTCTGTATAGTGTGA 

AAGCAGATTCTTAC CATATCA 

>G1415 Amino Acid Sequence (domain in AA coordinates: TBD) 
MSSIEPKVMMVGANKKQRTVQASSRKGCMRGKGGPDNASCTO^ 

NRGARLWLGTFDTSREAAIiAYDSAARKLYGPEAHLNLPESLRSYPKTASSPASQTTPSSN 
TGGKSSSDSESPCSS1TOMSSCGRVTEEISWEHINVDLPVMDDSSIWEEATMSLGFPWVHE
GDND I SRFDTC I SGGYSNWDS FHS PL *
>G1417 (32..1501)

TCTATCTCTATCTATCTCTCTTTGTCTGCAAATGGAAGAACATATTCAAGATCGCCGTGA 
AATTGCGTTCTTACACTC^GGAGAATTTCTC 

ACCGAACGAGTCTCCGGTGGAACGTCATCACGAGTCGTCTATCAAAGAAGTTGATTTCTT 

CGCTGCTAAAAGTCAGCCGTTTGATCTTGGTCATGTGAGAACAACGACGATCGTTGGATC 

ATCTGGTTTTAATGATGGATTAGGTTTGGTAAATTCATGTCATGGAACATCAAGCAATGA 

TGGCGATGACAAAACCAAAACTCAAATTAGTAGACTGAAGTTGGAGCTAGAGAGGCTTCA 

CGAGGAGAATCACAAACTGAAGCATTTATTAGATGAGGTCAGTGAGAGTTACAACGACCT 

CCAAAGAAGAGTTTTGTTAGCAAGACAAACACAAGTGGAAGGTCTTCA 

TGAGGATGTACCTCAAGCTGGTTCCTCACAAGCTCTAGAGAACAGAAGACCAAAGGATAT 

GAACCATGAAACTCCGGCCACCACCTTGAAACGACGGTCTCCAGACGACGTGGATGGTCG 

TGATATGCACCGAGGATCACCAAAAACTCCTCGAATAGACCAAAACAAGAGTACTAATCA 

TGAAGAACAACAAAACCCTCATGATCAATTACCCTATAGAAAAGCTAGGGTTTCCGTTAG 

AGCTAGATCTGATGCCACTACGGTAAATGACGGATGTCAATGGAGAAAATACGGTCAGAA 

AATGGCGAAAGGGAATCC^TGTCCTCGCGCTTATTATCGTTGCACCATGGCCGTTGGATG 

TCCTGTCCGTAAACAGGTCCAACGATGCGCGGAGGATACAACTATCTTGACAACAACGTA 

CGAAGGAAACCATAACCATCCTCTTCCCCCGTCAGCCACAGCCATGGCTGCAACCACCTC 

CGCCGCAGCAGCCATGCTCTTATCAGGCTCCTCCTC 

TAGCCCCTCCGCCACGTCATCATCATCCTTCTACCATAACTTCCCATACACCTCCACAAT 
CGC^CACTCTCTGCCTCAGCTCCTTTC^ 

TCGACCGCTACAACCGCCACCGCAGTTTCTAAGCCAGTATGGTCCCGCCGCGTTTTTACC 

AAACGCTAATCAAATTAGGTCTATGAATAATAATAACCAGCAGTTATTAATACCTAATTT 

GTTTGGCCCACAAGCCCCACCACGTGAAATGGTCGATTCAGTTAGGGCTGCGATTGCGAT 

GGATCCGAACTTCACGGCGGCACTTGCGGCCGCGATCTCAAACATTATCGGAGGAGGTT^A 

TAACGACAACAATAATAATACTGATATTAATGATAACAAGGTTGATGCAAAAAGTGGAGG 

GAGTAGTAACGGAGATTCGCCACAGCTTCCTCAGTCTTGCACCACTTTCTCTACAAACTA 

ATTTTACTACCATTATTATATGTTATCTTATTATATATTACACAC^ 

TGCGTATCTTAAGTTTTTTTTTGGGGGCCATTATATATGAATGATATGGAGATCACTGA 

AGAGAGAGAGAGCTATTATGGGTTTTTTTTT 

>G1417 Amino Acid Sequence (domain in AA coordinates: 239-296) 
MBEHIQDRREIAPLHSGEFIjHGDSDSKDHQPNESPVERHHESSIKEVDFFAAKSQPFDLG 
HVRTTTI VGS SGFNDGLGLVNS CHGTS SNDGDDKTKTQ I SRLKLELERI1HEENHKLKHI1L 
DEVSESYNDLQRRVXjLARQTQVEGLHHKQHED^ 

RRS PDDVDGRDMHRGS PKTPRIDQNKSTNHEEQQNPHDQLPYRKARVSVRARSDATTVND 
GCQWRKYGQKMAKGNPCPRAYYRCTMAVGCPW 

SATAMAATTSAAAAMIiLSGSSSSNIiHQTIiSSPSATSSSSFYHNFPYTSTIATLSASAPFP 

TITLDLTNPPRPLQPPPQFLSQYGPAAFLPNANQIRSMWNNNQQLLIPNLFGPQAPPREM 

VDSVRAAIAMDPNFTAALAAAISNIIGGGNITON^^ 

QSCTTFSTN* 

>G1442 (1..1293) 

ATGGGAACAAGAGCAGAACGCAAGGAAGATTTTGTTGGTGGGTTTGGATTTGGTGCT 

GAAAATTCGCATAAAGACGTTATGGTGCTACCTCATCATCACTATTATCCATCATATTCA 

TCACCTTCCTCTTCTTCTTTGTGTTACTGTTCTGCTGGTGTTAGCGATCCCATGTTCTCT 

GTTTCTAGCAATCAGGCTTACACTTCTTCTCACAGTGGTATGTTCACACCCGCCGGTTCT 

GGTTCTGCTGCTGTGACTGTAGCAGATCCTTTTTTCTCCTTGAGCTCTTCAGGGGAAATG 

AGAAGAAGTATGAACGAAGATGCTGGTGCAGCTTTCAGCGAAGCTCAATGGCATGAGCTT 

GAGAGGCAGAGGAA^ATATACAAGTACATGATGGCTTCTGTTCCTGTTCCTCCAGAGCTT 

CTGACACCCTTTCCGAAGAACCACCAATC^^ 

GCGACAGGAGGCTCATTGC^GCTGGGGATTGCTTCAAGCGCAAGCAATAACACGGCTGAT 
CTGGAGCCATGGAGGTGCAAGAGAACAGATGGGAAGAAATGGAGGTGCTCTAGAAACGTG 
ATTCCTGATCAGAAATACTGTGAGAGACACACACACAAGAGCCGTCCTCGTTCAAGAAAG 
CATGTGGAATCATCTCACCAATCATCTCACCACAATGACATTCGTACGGCTAAGAATGAT 
ACTAGCCAGCTTGTGAGAACTTATCCTCAGTTTTACGGACAACCTATAAGCCAGATCCCT 
GTGCTTTCTACTCTTCCGTCTGCCTCCTCTCCATATGATCACCACAGAGGACTGAGGTGG 
TTTACGAAAGAAGATGATGCCATTGGAACCTTAAACCCGGAGACTCAAGAAGCTGTCCAG 
CTGAAAGTTGGATCAAGCAGAGAGCTCAAACGGGGATTCGATTATGATCTGAATTTCAGG
CAGAAAGAGCCAATAGTAGACCAGAGCTTTGGAGC^TTGCAGGGTCTATTAAGTCTAAAC

GCGATGGGAAGCTCTCTGACACTCTCAATGGOTGGAGGAGGCATGGAGGAAACAGAGGGA 
ACAAACCAGCATCAGTGGGTTAGCCATGAAGGTCCATCATGGCTCTATTCAACAACACCA 
GGTGGACCATTGGCTGAAGCACTGTGTCTCGGTGTCTCCAACAACCCAAGTTCTAGTACT 
ACTACTAGTAGCTGCAGCAGAAGCTCAAGCTAA 

>G1442 Amino Acid Sequence (domain in AA coordinates: 172-223) 
MGTRAERKEDFVGGFGFGVVENSHKDVIWLPH^ 

VSSNQAYTSSHSGMFTPAGSGSAAVTVADPFFSLSSSGEMRRSMNBDAGAAFSEAQWHEL 

ERQRNIYKYMMASVPVPPELLTPFPKNHQSN™ 

LEPWRCKRTDGKK^CSRNVIPDQKYCERHTHKS 

TSQLVRTYPQFYGQPISQIPVLSTLPSASSPYDHHRGLRWFTKEDDAIGTLNPETQEAVQ 

LKVGSSRELKRGFDYDLNFRQKEPIVDQSFGALQGLI»S^ 

AMGSSLTLSMAGGGMEETEGTNQHQWSHEGPSWL^^ 

TTSSCSRSSS*

>G1454 (86..1180)

CTAGTAGTGATGATATGATCGCTTCTTCTCCTACAATCTCAGAAACCTCCGATCACGGTT 

TTAGATATCTTCTACAACGGATACAATGGAGAGCACCGATTCTTCCGGTGGTCCACCACC 

GCCACAACCTAACCTTCCTCCAGGCTTCCGGTTTCACCCTACCGACGAAGAGCTTGTTGT 

TCACTACCTCAAACGCAAAGCAGCCTCTGCTCCTTTACCTGTCGCCATCATCG 

CGATCTCTATAAATTTGATCCATGGGAACTTCCCGCTAAAGCATCGTTTGGAGAACAAGA 

ATGGTACTTCTTTAGTCCACGAGATCGGAAGTATCCAAACGGAGCAAGACCAAACAGAGC 

GGCGACTTCAGGTTATTGGAAAGCGACCGGTACAGATAAACCGGTACTTGCTTCCGACGG 

TAACCAAAAGGTGGGCGTGAAGAAGG(^CTAGTCTTCTA(^GTGGTAAACCACCAAAAGG 

CGTTAAAAGTGATTGGATCATGCATGAGTATCGTCTCATCGl^AAACAAACCAAACAATCG 

ACCTCCTGGCTGTGATTTCGGCAACAAAAAAAACTCACTCAGACTTGATGATTGGGTG 

ATGTAGAATCTACAAGAAGAACAACGCAAGTCGACATGTTGATAACGATAAGGATCATGA 

TATGATCGATTAC^TTTT(^GGAAGATTCCTCCGTCTTTATCAATGGCGGCTGCTTCTAC 

AGGACTTC^CCAACATC^TCATAATGTCTC^GATCAATGAATTTCTTCCCTGGC^^CT 

CTCCGGTGGTGGTTACGGGATTTTCTCTGACGGTGGTAACACGAGTATATACGACGGCGG 

TGGCATGATCAACAATATTGGTACTGACTCAGTAGATCACGACAATAACGCTGACGTCGT 

TGGTTTAAATCATGCTTCGTCGTCAGGTCCTATGATGATGGCGAATTTGAAACGAACTCT 

CCCGGTGCCGTATTGGCCTGTAGCAGATGAGGAGCAAGATGCATCTCCGAGCAAACGGTT 

TCACGGTGTAGGAGGAGGAGGAGGAGATTGTTCGAACATGTCTTCCTCCATGATGGAAGA 

GACTCCACCATTGATGC^CAACTVAGGTGGTGTGTTAGGAGATGGATTATTCAGAACGAC 

ATCGTACCAATTACCCGGTTTAAATTGGTACT^CTTCTTAATCAAATGTGTTTCGCCGCCG 

GTGTGAAGAATTTTCCGGTGACAGTGAAGATTTTTTTCCGATTGGTGGGGTCATTTGCAT 

GCATTATATAATTTGAGATTTGTGTATATGTTTTGGGTTAATTAATTGGTCACAGGGGC

>G1454 Amino Acid Sequence (conserved domain in AA coordinates: 9-178)

MESTDSSGGPPPPQPNLPPGFRFHPTDEELVVHYLKRKAASAPLP 

ELPAKASFGEQEWYFFSPRDRKYPNGARPNRAATSGYWKATGTDKPVIiASDGNQKVGVKK 
ALVFYSGKPPKGVKSDWIMHEYRLIENKPNNRPPGCD^ 

ASRHVDNDKDHDMIDYIFRKIPPSLSMAAASTGIjHQHHHNVSRSMNFFPGKFSGGGYGIF 
SDGGNTS I YDGGGMINNI GTDS VDHDNNADVVGLNHAS S SGPMMMANLKRTLPVP YWP VA 
DEEQDAS PS KRFHGVGGGGGDC SNMS S SMMEETPPLMQQQGGVLGDGLFRTT S YQIjPGLN 
WYSS* 

>G1459 (1..1272) 

ATGATGAAAGGTCT-GATTGGGTATAGATTTAGTCCGACGGGAGAGGAAGTGATCAACCAT 
TACCTAAAGAACAAACTTCTGGGTAAGTATTGGCTCGTTGATGAAGCTATTAGCGAGATC 
AACATCTTGAGTCACAAACCCAGCAAGGATTTGCCTAAGTTAGCTAGGATCCAATCGGAA 
GATCTTGAATGGTATTTCTTCTCTCCGATTGAGTACACGAACCCGAATAAGATGAAAATG 
AAGAGGACGACAGGTTCTGGGTTTTGGAAACCTACTGGTGTTGATCGGGAAATTAGGGAT 
AAAAGAGGAAATGGTGTTGTGATAGGGATTAAGAAGACGCTTGTGTACCATGAAGGTAAG 
AGTCCTCATGGAGl^AGAACTCCTTGGGTTATGCACGAGTATCACATCACTTGCTTGCCT 
CATCIATAAGAGGA/^ATATGTTGTCTGCCAAGTAAAGTATAAGGGTGAAGCTGCAGAAATT 
TCATATGAGCCAAGTCCCTCTTTGGTATCCGATTCGCATACCGTCATAGCGATTACCGGA 
GAACCGGAACCTGAGCTTCAGGTTGAGCAGCCAGGTAAAGAAAATCTCTTGGGTATGTCT
GTAGATGATTTGATAGAACCAATGAACCAACAAGAGGAGCCACAAGGTCCTCACTTAGCT
CCGAATGATGATGAGTTTATACGTGGATTGAGGCATGTTGATCGAGGGACGGTTGAATAT 
TTGTTTGCCAATGAAGAAAACATGGATGGTTTGTCTATGAATGACTTGAGAATCCCAATG 
ATCGTCCAACAAGAGGATCTCTCTGAGTGGGAGGGATTTAACGCAGACACCTTTTTCAGC 
GACAACAACAATAACTATAACCTTAACGTGCATCATC7UVCTAACGCCTTACGGCGATGGC 
TATTTGAATGCATTTTCGGGTTATAACGAAGGGAATCCTCCCGATCACGAATTAGTGATG 
CAAGAGAACCGCAACGATCACATGCCAAGGAAACCTGTGACAGGGACCATTGATTATAGC 
AGCGATAGTGGCAGTGATGCTGGATCCATATCTACAACGGTGAAACAAGAAATCCCAAGA 
GCTGTTGATGCACCCATGAACAATGAGTCATCTTTGGTGAAAACAGAGAAGAAAGGCTTG 
TTTATTGTAGAGGACGCAATGGAGAGAAACCGCAAGA^CCACGATTTATCTATCTCATG 
AAGATGATCATAGGCAA.CATCATATCGGTTTTACTACCCGTCAAAAGATTGATCCCGGTG 
AAGAAGTTATGA

>G1459 Amino Acid Sequence (conserved domain in AA coordinates: 10-152)
MMKGLIGYRFSPTGEEVINHYLKNKLLGKY^ 

DLEWYFFSPIEYTNPNKMKMKRTTGSGFWKPTGVDREIRDKRGNGWIGIKK^ 

SPHGVRTPWVMHEYHITCLPHHKRKYVVCQVKYXGEAAEI S YEPS PSLVSDSHTVIAITG 

EPEPEa^QVEQPGKENLLGMSVT)DLIEPMNQQEEPQ^ 

LFANEENMDGLSMNDLRIPMIVQ 

YLNAFSGYNEGNPPDHELVMQENRITO 

AVDAPMNNES SLVKTEKKGLFIVEDAMERNRKKPRFI YLMKMI IGNI I SVLLPVKRL I PV 
KKL*

>G1460 (87..995)

CGTCGACCTTCACTCAAACCCTAATCCCGGGAACC 

TTTCGATCTGTTTCTATTTTAAAAAGATGATGAAAGATCCGACTGGGTATAGATTTAGTC 

CGACGGGAGAGGAAGTGATAAACCATTACCTAAAGAACAAAATTCTGGGTAAGACTTGGC 

TCGTTGATGAAGCCATTAGCGAGATCAACATCTTGAATCACAAACCCA 

CTAAGTTAGCTAGGATCCAATCGGAAGATCTTGAGTGGTACTTTTTCTCTCCGATTGAGT 

ACACGAACCCGAATAAGATGAAAATGAAGAGGACGACAGGTTCTGGGTTTTGGAAACCTA 

GTGGTGTTGATCGGAAAATTAGGGATAAAAGAGGAAATGGTGTTGTGATAGGGATTAAGA 

AGACGCTTGTGTACCATGAAGGTAAGAGTCCTCATGGAGTTAGAACTCCTTGGGTTATGC 

ACGAGTATCACATCACTTGCTTGCCTCATCATAAGAGGAAATATGTTGTCTGCCAAGTAA 

AGTATAAGGGTGAAGCTGCAGAAATTTCATATGAGCCAAGTCCCTCTTTGGTATCCGATT 

CG(^TACCGTCATAGCGATTAACGGAGAACCGGAACCTGAGCTTC^GGTTGAGC^GCCAG 

GTAAAGAAAATCTCTTGGGTATGTCTGTAGATGATTTGATAGAACCAATGAACCAACAAG 

AGGAGCCACAAGGTCCTCACTTAGCTCCGAATGATGATGAGTTTATACGTGGATTGAGAC 

ATGTTGATCGAGAGCCGGTTGAATATTTGTTTGCCAATGAAGAAAACATGGATGGTTTGT 

CTATTATGAATGACTTGACAATCCCAATGATCGCCCAACAAGAGGATCTCATTCTCTCTG 

AGTGGGAGGGATTTATCGCAGCCACCTTTTTC^GCGACAAC^CAATAAC^ATAACCTTA 

ACGTGCATCAACTAACGTCTTTCTTACCGGGATGATTATC^GAATGCATTTTGGGTTACA 

ACGGAGCGNCCGCT

>G1460 Amino Acid Sequence (domain in AA coordinates: TBD)

MMKDPTGYRFSPTGEEVINHYLKNKILGKTWLVDK 

DLEWYFFSPIEYTNPNKMKMKRTTGSGFWKPSGVIDRK^ 

SPHGV^TPWVMHEYHITCLPHHKRKYVVCQVKYKGEAAEISYEPSPSLVSDSHTVIAIN^ 
EPEPELQVEQPGKENLLGMSVDDIjIEPMNQQEEPQGPHLAPNDDEFIRGLRHVDREPVEY 
LFANEENMDGLS IMNDLTI PM I AQQEDL IIiSEWEGF I AATFFSDNlSnsnsn^LNVHQLTSFL 
PG*

>G147 (37..672)

AAATCATCAGATAGAAGGAAATATTCTGATTGAGAGATGGCTCGTGGAAAGATTCAGCTT 

AAGAGGATTGAGAACeCGGTTCACAGACAAGTGACTTTTTGCAAGAGGAGAACTGGTCTT 

CTCAAGAAGGCTAAGGAGCTCTCTGTGCTCTGTGATGCCGAGATCGGTGTTGXGATCTTC 

TCTCCTCAGGGCAAGCTCTTTGAGCTCGCTACTAAAGGAACAATGGAGGGAATGATTGAT 

AAGTACATGAAGTGTACTGGTGGTGGTCGTGGTTCTTCTTCTGCTACTTTTACTGCTCAA 

GAACAACTTCAACCACCAAATCTTGATCCGAAAGATGAGATCAACGTGCTTAAGCAAGAG 

ATTGAGATGCTTCAGAAAGGGATAAGCTATATGTTTGGAGGAGGAGATGGGGCTATGAAT 

CTTGAAGAACTTCTTTTGCTTGAGAAGCATCTTGAGTATTGGATTTCTCAGATTC 

GCTAAGATGGATGTTATGCTTCAAGAAATTCAGTCATTGAGGAACAAGGAAGGAGTCCTC
AAAAACACCAACAA.GTATCTCCTCGACAAGATAGAGGAAAACAACAATAGCATATTAGAT
GCTAACTTCGCAGTCATGGAGACAAACTATTCCTATCCGCTAACAATGCCAAGTGAAATA 
1TTCAGTTCTAGACCATAGGGTATTTGAAGACTATGTCTCACGAATTTAAATAACCTTGG 
TAAGTATAATATAGTGTTGTTAAATCACACATAATTAAAATAAAGCCTGTGGAACTTCGC 
TAGGCAGTTGAAAATCTATCCGTATGTTTTATCCTCTTGTTTTAC^ 

GATGAAATGACTGCAAGTGTGGTGTGTACTTATAACTCTTTCTACTTTCTATCTATGTTT 
TGAATTTATGGATT 

>G147 Amino Acid Sequence (domain in AA coordinates: 2-57) 

MARGKZQLKRIKNPVHRQVTFCKRRTGLLKKAKELSVLCDAHIGWIFSPQGKD 

GTMEGMIDKyMKCTGGGRGSSSATFTAQEQLQPPNIjDPKDEINVIjKQEIEMIjQKGISYMF 

GGGDGAM^EELLLIiEKHIiEYWISQIRSAiayiDW^ 

ENNNSILDANFAVMETNYSYPLTMPSEIFQF* 

>G1471 (1..735) 

ATGGAGAACCAATCTATGTCTTCATCAAGCTCCTCCACACACAAACATGATCAAAAACTC 

AAAAGTTCCGTTGTGGCCATGGAGGTCCTGGAGGAGAAGGAGACAGTGAACAATCCGCCC 

CAGT ATTATAATAAGATCTACATCTGTTACTTGTGCAAGAGAGCGTTC CCAACCCC TCAT 

GCCCn*TGGCGGTCACGGAACCACCCACAAGGAGGACCGAGAATTGGAGAGGCAACAGATC 

GAGTCAAGGCTTTCTAACAAAGACAAGTCTAACTTGCTCTTTGGTGGGTCTTCACAAGAT 

GTTTTATCAAATGATAATCACCTTGGACTCTCTCTTGGTCCA.TTGAAGTCCATAGAAGGT 

AGCAGCAGCAGCAAGAACGTTAACCCATTGCTTAATGTTGGAG 

GATATGAACATGAACAACTATAGCTCACATGCT 

CTTACTCTTGGTCCATCTAAGTCCATAGGAGATAGCAACAATATCATTAATAACAACACT 
AACTGATCCTTCGATGGGAATCTGATCATTCCCGTTCGTCCTCGTGTGTCTAGATACCAT 
TTTGTTGCTGGGAACCCCCTTGATTCAATCTCTAGAAACATTCCTCCTTCTATTACTTTT 
CCTCATCTAAAC^TCAATCTTTCTCATGATO^ 

TCTAGTCACTCATAA

>G1471 Amino Acid Sequence (domain in AA coordinates: 49-70) 

MENQ SMSSSSSS THKHDQKLKS S WAMEVLEEKETVNNP PQ YYNKI Y I C YLCKRAFPTPH 

ALGGHGTTHKEDRELERQQIESRLSNKDKSNL^ 

SSSSNlHTNPLIiNVGVPRGTTDN^^ 

NSSFDGNLI I PVRPRVSRYHFVAGNPLDS ISRNI PPS ITFPHIiNINLSHDSFSLQENGSG 
SSHS*

>G1475 (1..645)

ATGAAGAGAACACATTTGGCAAGTTTTAGTAACAGAGACAAAAGCCAAGAAGAAGAAGGA 
GAAGACGGTAATGGTGACAACAGAGTCATCATGAATCACTACAAGAATTACGAAGCTGGG 
CTGATCCCATGGCCTCCCAAGAATTACACTTGCAGCTTCTGCAGGAGAGAGTTCAGATCT 
GCTGAAGCACTTGGAGGCGACATGAATGTTCATAGAA 

ATCCCTTCTTGGCTCTTCGAACCTCACCACCACACIACCTATTGCAAACCCTAACCCTAAT 

TTTAGCTCTTCTTCTTCCTCTTCAACAACAACAGCTCATCTTGAGCCTTCCCTAACCAAC 

CAGAGATCCAAAACCACTCCTTTTCCTTCTGCCCGGTTTGATCTTTTGGACAGTACTACT 

AGCTATGGAGGTTTGATGATGGACAGAGAGAAGAACAAGAGCAATGTATGTAGCAGAGAG 

ATGAAGAAAAGTGCCATCGATGCATGTCATTCAGTAAGATGTGAGATAAGCCGTGGGGAT 

CTGATGAATAAGAAAGATGATCAAGTCATGGGGTTGGAGCTTGGGATGAGTTTGAGGAAT 

CCCAACCAAGTTCTTGATTTGGAGCTTCGACTAGGCTACCTCTAA 

>G1475 Amino Acid Sequence (domain in AA coordinates: 51-73) 

MKRTHLASFSNRDKTQEEEGEDGNGDNRVIMN^ 

AQALGGHMNVHRRDRAKLRQIPSWLFEPHHHTPIANPNPN^ 

QRSKTTPFPSARFI>iiLDSTTSYGGLMMDREKlTKSNVCSREIKKSAIDACHSVRCEISRGD 
LMNKKDDQ VMGLELGMS LRNPNQVIjDLELRLGYIj *
>G1477 (1..606)

ATGTTGTCCTCGGACTCGAATTACGCTAGTGATATTAGCGACGATGCCTCCGCCACCGGA 
TCGATAGAGAATCCTATATACAAATGCAAGTATTGTCCTAGGAAGTTCGATAAAACACAA 
GCATTAGGTGGTC^TCAAAATGCACACAGAAAGGAGAGAGAGGTCGAAAAACAAC^AAA^ 
GGATTTTTGGCGCATTTGAACCGACC^GAACCAGATC^^ 

CATCATTCATTTCCTAACCAATACGCACTCCCACCGGGATTTGAACAGCCTCAGTACAAA 
GTTGATAGATCATACAAGATGTCCATGGTCTACAACCAATATGTGGGATCCTCAAGCTCT 
AGCTTTGCAGGACTACAAAGTGACCCAAGTCAAGGT^TGAACCAGGATTGGACCTTTACC
GGGATCCCATTCCTACCCCAATCTCAACCTCAACCACTATCGTCACCAATATGTTTGGAT
CTTTGCCTTGGCATTGGTAGCTCCCAAACCCAACCACAACCTCAAGAACCAAA 
ACAGAAGAGATGGATGCTGAGAAAGAAAATGATGGTTCTTCCCTTTCTCTCTCACTCAAA 
CTGTGA 

>G1477 Amino Acid Sequence (domain in AA coordinates: 29-48) 
MLS SDSNYASDI SDDASATGS I ENP I YKCKYCPRKFDKTQALGGHQNAHRKEREVEKQQK 
AFLAHLNRPEPDLYAYSYSYHHSFPNQYALPPGFEQPQYKVDRSYKMSMVYNQYVGSSSS 
S FAGLQSDPSQGMNQDWTFTGI PFLPQSQPQPLS S P I CLDLCLG IGS S QTQPQPQEPNDA 
TEEMDAEKENDGSSIiSLSLKL* 
>G1487 (1..1020) 

ATGGAACAAGCCGCGTTGAAGAGCAGCGTCAGGAAAGAGATGGCTCTCAAAACGACTTCT 

CCGGTTTACGAAGAGTTTCTTGCCGTCACCACCGCTCAAAATGGCTTTTCCGTCGACGAT 

TTCTCTGTAGACGACTTGCTTGACTTGTCAAACGATGACGTTTTTGCCGACGAAGAAACT 

GACCTCAAGGCTCAACATGAGATGGTCCGTGTTTCCTCTGAGGAACCCAACGACGACGGA 

GACGCTCTTCGCCGGAGCAGCGATTTCTCCGGCTGTGACGACTTTGGTTCTCTCCCTACA 

AGCGAACTCTCTCTTCCGGCGGATGATTTAGCGAACCTTGAGTGGCTCTCTCATTTCGTG 

GAGGACTCCTTCACGGAATATTCGGGTCCAAACCTCACCGGAACCCCGACTGAGAAACCG 

GCGTGGTTAACGGGTGACCGGAAACATCCTGTGACTGCAGTCACGGAAGAGACCTGTTTC 

AAATCCCCTGTTCCGGCTAAAGCCCGTAGCAAACGTAACCGCAATGGCCTCAAGGTCTGG 

TCGCTTGGTTCGTCGTCCTCCTCGGGTCCTTCCTCGTCCGGTTCGACCTCOTCCTCCTCT 

TCGGGTCCTTCCAGCCCGTGGTTCTCCGGCGCTGAGCTGCTCGAGCCTGTGGTCACGTCA 

GAGAGGCCACCGTTTCCCAAGAAGCATAAGAAAAGGTCAGCCGAGTCTGTTTTCTCCGGT 

GAGCTGCAGCAGCTGCAACCTCAGCGAAAGTGCAGCCACTGCGGCGTTCAGAAAACTCCG 

CAGTGGAGAGCCGGGCCAATGGGAGCCAAGACCCTGTGCAATGCGTGCGGTGTCCGGTAC 

AAGTCGGGTAGGTTGCTACCGGAATACAGACCCGCTTGTAGCCCGACATTCTCGAGTGAG 

CTGCACTCGAACCACCACCGGAAAGTCATAGAGATGAGGCGGAAGAAGGAGCCAACCAGT 

GACAACGAAACCGGTTTAAACCAGCTGGTTCAGTCCCCACAAGCTGTACCAAGTTT^

>G1487 Amino Acid Sequence (domain in AA coordinates: 251-276)

MEQAALKS S VRKEMALKTTS PVYEEFLAVTTAQNGFS VDDFS VDDIjLDLSNDDVFADEET 

DLKAQHEMVRVSSEEPNDDGDALRRSSDFSGCDDFGSLPTSEIjSLP 

EDS FTE YSGPNIjTGTPTEKPAWLTGDRKHPVTAVTEETCFKS pvpakars krnrnglkvw 

slgsssssgpsssgstsssssgpsspwfsgaellepvvtserppfpkkhkkrsaesvfsg 
elck^lqpqrkcshcgvqktpqwragpmgaktlcnacgvryksgrllpeyrpacsptfsse 
lhsnhhrkviemrrkkeptsdnetglnqlvqspqavpsf*

>G1492 (149..919)

AATCCCAACCC^C^CACCTCTC^VATCCT^ 

CAGAACCAAAACATATCAAACCTTTTTTTCTCTTGGGTTTAAGTAAAAA 

AAGCTXTAACGGCAATAAATTTCACGGAGTTAGACCTTACGTACGGTCTCCAGTTCCACG 
GCTTAGATGGACGCCGGATCTTCACCGTTGTTTCGTTCACGCCGTCGAGATTCTCGGTGG 
TCAACACCGAGCAACACCAAAACTTGTTCTTAAGATGATGGATGTGAAGGGACTTACCAT 
TTCACATGTCA7U\AGCCACCTTCAGATGTATAGAGGAGGTTCAAAGCTCACTTTGGAGAA 
ACCAGAAGAAAGCTCATCATCTTCAATAAGAAGAAGACAAGACAGTGAAGAAGATTATTA 
TCTTCATGACAACTTGTCT'TTAC^CACAAGGAATGATTGTCTTTTGGGTTTTCACTCTTT 
TCCTCTTTCTTCACATTCTTGATTTAG 

GACTTCAGAGTCTGGTGGTTATGATGATGATGCTGACTTTCTTCACATCAAGAAGATGAA 

CGATACGACGACGTTTTTGTCACATCATTTCCCCAAGGGAACAGAGGAGTGGCGGGAACA 

AGAACACGAAGAAGAAGAAGAAGATTTGTCGTTGTCTCTGTCGTTAAATCATCATCATTG 

GAGAAGCAATGGATCATCGGTGGTGAGCGAAACGAGTGAAGCAG(^GTCTCGACTTGTTC 

AGCACCATTCGTATCeAAAGATTGCTTTGGTTCTTCAAAGATTGATCTTAATCTC^ 

TTCTCTCCTCGGTAGCTAAATAAGTTATGCAAGATTTAGGTTCAGAGAAACTATTCGGAT 

GTGTTTTTGAAACTAGGATATTGAATGTTAGTAGAGAAACCTAGAAAATGAAGTTTAGAT 

AAATTATCAACGCAGCGTTTTGATCGCCTTTGAACGGAAAATTAACAAA

>G1492 Amino Acid Sequence (domain in AA coordinates: 34-83)

MGKS SGRNGNGS FNGNKFHGVRPYVRS PVPRIjRWTPDIjHRCFVHAVE IIjGGQHRATPKLV 

lkmmdvkgltishvkshlqmyrggskl^ 

rndcllgfhsfplsshssfrgggggrtkeqqtsesggydddadflhikkmndtttflshh
FPKGTEEWREQEHEEEEEDLSLSLSLNHHHWRSNGSSVVSETSEAAVSTCSAPFVSKDCF
GS S KIDIiNLS I SliLGS * 
>G1531 (1..666) 

ATGTGTGAGTCAAGCAACAAAGTCAGAGTATCGCCATACCCGCTTCGGTCTTCGAGGACC 
GACAAACACAAGGCGTCAGAGTCGCCTATTGAGACAGGTTGGGAGGATGTGCGTGGATGT 
CATCCTTACATGTGCGATACGAGTGTTCGTCACTCCAATTGTTTCAAGCAGTTCCGCAGA 
AAAACCATAAAAAAGCGCCTATACCCCAAGACCTTACATTGTCCTCTCTGTAGAGGTGAA 
GTATCCGAGACGACAAAGGTGACGAGCACTGCAAGAAGATTTATGAATGCTAAACCGAGG 
TCTTGCTCCGTAGAGGATTGCAAATTCTCTGGGACGT^ 

AAAACTGAGCATCGCGGTATTGTGCCACCAAAGGTCGATCCACTGAGACAACAGAGATGG 
GAAATGATGGAGAGACATTCTGAATACGTTGAACTCATGACTGCAGCTGGGATTTCGCGT 
ATGGCTGAGGTGATGCAACAACAGCTTCCCCAGGATCAGAAT 

GTGACCGTTAATGGAACCATATGGAATCTAATTGATCCGAGTCAGGGAAGGAATGGATTA 
GGCATCAC CAACTATAGCGCAATG CAGTTTGTACCATTAAG CATAAATCACAGTAGAACT 
CTGTGA 

>G1531 Amino Acid Sequence (domain in AA coordinates: 41-77) 

MCES SNKVRVSPYPLRSSRTDKHKASE S PI ETGWEDVRGCHPYMCDTS VRHSNCFKQFRR 

KTIKKRIiYPKTLHCPLCRGEVSETTKVTST^^ 

KTEHRGIVPPKVDPLRQQRWEMMERHSEYVEL^ 

VTVNGTIWNIiIDPSQGRNGLGITNYSAMQFVPLSINHSRTL*

>G1540 (122..997)

atctctttactaccagcaagttgttttcttgctaacttcaaacttctctttctcttgttc 
ctctctaagtcttgatcttatttaccgttaactttgtgaacaaaagtcgaatcaaacaca 
catggagccgccacagcatcagcatcatcatcatcaagccgaccaagaaagcggcaacaa 
caacaacaagtccggctctggtggttacacgtgtcgccagaccagcacgaggtggacacc 
gacgacggagcaaatcaaaatcctcaaagaactttactacaacaatgcaatccggtcacc 
aacagccgatcagatccagaagatcactgcaaggctgagacagttcggaaagattgaggg 
caagaacgtcttttactggttccagaaccataaggctcgtgagcgtcagaagaagagatt 
caacggaacaaacatgaccacaccatcttcatcacccaactcggttatgatggcggctaa 
cgatcattatcatcctctacttcaccatcatcacggtgttcccatgcagagacctgctaa 
ttccgtcaacgttaaacttaaccaagaccatcatctctatcatcataacaagccatatcc 
cagcttcaataacgggaatttaaatcatgcaagctcaggtactgaatgtggtgttgttaa 
tgcttctaatggctacatgagtagccatgtctatggatctatggaacaagactgttctat 
gaattacaacaacgtaggtggaggatgggcaaacatggatcat cat tact catctgcacc 
ttacaacttcttcgatagagcaaagcctctgtttggtctagaaggtcatcaagacgaaga 
agaatgtggtggcgatgcttatctggaacatcgacgtacgcttcctctcttccctatgca 
cggtgaagatcacatcaacggtggtagtggtgccatctggaagtatggccaatcggaagt 
tcgcccttgcgcttctcttgagctacgtctgaactagctcttacgccggtgtcgctcggg 
attaaagctctttcctctctctctctctttcgtactcgtatgttcacaactatgcttcgc 
tagtgattaatgatgcagttgttatattagtagttaactagttatctctcgttatgtgta 
atttgtaattactagctaagtatcgtctaggtttaattgtaattgacaaccgtttatctc 
tatgatgaataagttaaatttatatat 

>G1540 Amino Acid Sequence (domain in AA coordinates: 35-98) 
MEPPQHQHHHHQADQESGNNNNKSGSGGYTCRQTSTRWTPTTEQIKILKELYYNNAIRSP

TADQIQKITARLRQFGKIEGKNVFYWFQNHKARERQKKRFNGTNMTTPSSSPNSVMMAAN
DHYHPLLHHHHGVPMQRPANSVNVKLNQDHHLYHHNKPYPSFNNGNLNHASSGTECGVVN
ASNGYMSSHVYGSMEQDCSMNYNNVGGGWANMDHHYSSAPYNFFDRAKPLFGLEGHQDEE
ECGGDAYLEHRRTLPLFPMHGEDHINGGSGAIWKYGQSEVRPCASLELRLN*
>G1544 (1..2178) 

ATGTCTCAGTCAAACATGGTACCAGTGGCTAACAACGGAGACAACAACAACGACAACGAA 
AACAACAAC AACAACAACAACAATGGTGGAACrTGACAA CTGGAAATGATT CT 

GGAGATCAAGATTTCGACAGTGGGAATACCTCAAGTGGCAATCATGGAGAAGGGTTGGGA 
AACAATCAAGCTCCTCGTCATAAGAAGAAAAAATACAAT^ 

TCGGAGATGGAAGCTTTCTTCAGAGAGTGTCCTCACCCAGATGACAAACAAAGGTACGAC 
CTTAGCGCTCAATTGGGATTGGACCCTGTTCAG 

ACTCAAAACAAGAATCAACAAGAACGCTTTGAGAACTCAGAACTTCGGAATCTGAACAAC 
GACCTTAGGTCTGAAAATCAGCGGTTACGAGAAGCTATTCATCAAGCCTTATGCCCTAAG 
TGTGGAGGCCAAACTGCAATTGGCGAAATGACCTTCGAAGAGCACCATCTTCGCATCCTC
AACGCTCGTTTGACTGAAGAGATCAAGCAACTTTCCGTGACAGCGGAAAAGATATCAAGG 
CTTACGGGGATACCAGTAAGGAGCCATCCCCGTGTGTCTCCTCCTAATCCTCCTCCAAAT 
TTCGAGTTCGGGATGGGATCTAAGGGAAATGTCGGAAACCACTCGAGGGAAACCACTGGA 
CCTGCAGATGCTAATACCAAGCCGATCATCATGGAGTTGGCATTTGGAGCCATGGAGGAG 
CTCTTGGTGATGGCTCAAGTGGCTGAACCACTGTGGATGGGAGGATTTAATGGCACTAGC 
TTAGCTTTGAACTTGGATGAATACGAAAAGACGTTTCGCACGGGTCTCGGTCCTAGACTT 
GGCGGGTTTCGAACCGAGGCATCCAGGGAAACTGCACTCGTGGCAATGTGTCCTACTGGC 
ATTGTTGAAATGCTCATGCAAGAGAATCTGTGGTCAACAATGTTTGCCGGAATTGTTGGT 
AGAGCCAGGACTCATGAACAGATAATGGCTGATGCTGCTGGAAACTTCAATGGAAATCTC 
CAAATAATGAGTGCTGAGTACCAAGTGCTTTCCCCGCTAGTCACAACCCGCGAAAGCTAC 
TTCGTCCGCTACTGTAAGCAACAAGGAGAGGGTTTGTGGGCGGTGGTCGATATTTCCATC 
GACCATCTCCTCCCAAACATCAACCTAAAATGTCGCCGCCGACCCTCTGGATGTCTGATT 
CAAGAAATGCATAGTGGTTACTCCAAGGTTACATGGGTGGAACATGTGGAAGTAGATGAT 
GCAGGAAGTTACAGCATCTTTGAGAAATTAATCTGTACTGGTCAAGCTU ,, ri'GCTGCTAAC 
CGCTGGGTTGGTACATTGGTACGCCAGTGTGAGCGGATATCTAGCATCTTGTCGACAGAT 
TTTCAATCTGTCGATTCCGGTGATCACATAACGCTAACTAACCATGGAAAGATGAGCATG 
CTGAAGATAGCTGAGCGGATTGCGAGAACCTTCTTTGCTGGAATGACCAATGCGACGGGG 
TCTACAATATTTTCTGGTGTTGAAGGAGAAGATATCAGAGTGATGACAATGAAGAGCGTG 
AATGATCCAGGAAAGCCTCCCGGTGTCATTATTTGTGCAGCCACTTCCTTTTGGCTTCCT 
GCTCCTCCTAACACTGTCTTTGACTTCCTCAGAGAGGCTACTCACCGACACAATTGGGAT 
GTTCTCTGCAACGGAGAGATGATGCACAAGATAGCAGAGATTACGAATGGGATAGACAAA 
AGGAACTGTGCAAGTTTACTCCGGCATGGACACACTAGCAAGAGCAAGATGATGATAGTT 
CAAGAGACTTCTACTGACCCAACAGCTTCATTTGTGCTTTATGCGCCTGTTGATATGACA 
TCAATGGATATTACTCTCCATGGAGGTGGTGATCCTGACTTTGTGGTGATCCTGCCTTCT 
GGTTTTGCTATTTTTCCAGATGGTACGGGTAAGCCTGGAGGAAAAGAAGGAGGATCACTT 
TTGACCATTTCCTTCCAAATGCTGGTTGAGTCAGGTCCTGAGGCTAGGCTGAGTGTTAGC 
TCTGTTGCAACTACTGAGAATCTGATTCGTACAACCGTGCGGAGGATCAAAGATTTGTTT 
CCTTGTCAGACTGCTTGA 

>G1544 Amino Acid Sequence (domain in AA coordinates: 64-124) 
MSQSNMVWAISTNG^ 

NNQAPRHKKKKYNRHTQLQISEMEAFFRECPHPDDKQRYDLSAQLGLDPVQIKFWFQNKR
TQNKNQQERFENSELRNLNNDLRSENQRLREAIHQALCPKCGGQTAIGEMTFEEHHLRIL

NARLTEEIKQLSVTAEKISRLTGIPVRSHPRVSPPNPPPNFEFGMGSKGNVGNHSRETTG
PADANTKP I IMELAFGAMEELLVMAQVAEPLWMGGFNGTSLAI^ 

GGFRTEASRSTALVAMCPTGIVEMLMQENLWSTMFAGIVGRARTHEQIMADAAGNFNGNL 
QIMSAEYQVLSPLVTTRESYFVRYCKQQGEGLWAVVDISIDHLLPNINLKCRRRPSGCLI
QEMHSGYSKVTWVEHVEVDDAGSYSIFEKLICTGQAFAANRWVGTLVRQCERISSILSTD

FQSVDSGDHITLTNHGKMSMLKIAERIARTFFAGMTNATGSTIFSGVEGEDIRVMTMKSV

NDPGKPPGVIICAATSFWLPAPPNTVFDFLREATHRHNWDVLCNGEMMHKIAEITNGIDK
RNCASLLRHGHTSKSKMMIVQETSTDPTASFVLYAPVDMTSMDITLHGGGDPDFVVILPS
GFAIFPDGTGKPGGKEGGSLLTISFQMLVESGPEARLSVSSVATTENLIRTTVRRIKDLF
PCQTA* 

>G156 (39..755)

AGGAAGAGGGAGCCACTCATAAGAGGAAGAAGAGAGAGATGGGTAGAGGGAAGATAGAGA 
TAAAGAAGATAGAGAATCAGACGGCGAGGCAAGTGACCTTCTCCAAGAGAAGAACTGGTC 
TTATAAAGAAGACTCGTGAGCTCTCTATTCTCTGTGACGCTCACATCGGTCTCATCGTCT 
TCTCAGCCACCGGAAAGCTTTCCGAGTTCTGCTCCGAACAGAACAGGATGCCTCAACTCA 
TTGACCGATACTTGCATACCAACGGATTGCGACTTCCTGATCATCATGACGACCAGGAGC 
AATTGCACCATGAGATGGAACTACTAAGAAGAGAGACATGTAACCTTGAGCTTCGTCTGC 
GTCCATTCCATGGACATGACTTAGCCTCCATTCCTCCTAATGAGCTTGACGGACTCGAGA 
GACAGCTAGAACATTCTGTCCTCAAAGTCCGTGAGCGTAAGAGGAGGATGCTAGAAGAAG 
ATAACAACAACATGTACCGTTGGCTTC^TGAG^^ 

CTGGGATAGATACCAAACCAGGGGAGTATCAACAGTTTATAGAGCAGCTTCAGTGCTATA 
AACCAGGGGAGTATCAGCAGTTTCTAGAGCAGCAGCAACAACAACCAAACAGCGTTCTTC 
AGCTTGCTACACTTCCTTCTGAGATTGATCCTACTTACAATCTCCAGCTTGCTCAG 
ATCTTCAAAACGATCCAACGGCCCAGAATGATTAATACAATTCTCAATAGATATCTACTC 
TTTCTTTATGGAGACAGATTCATGAACTTTTATTACCTATATTTTGATAAGCCAGTGTCT

TCTTTTGTGTGGCTATGGAAACCTTGTTTAAAGCACAATGCACTTGAGTTCTTGGTTATA 

TAATTAATCATCATTATTACATANWAAANAAiNNAAAAAAAAAAAAAA 

>G156 Amino Acid Sequence (domain in AA coordinates: 2-57) 

MGRGKIEIKKIENQTARQVTFSKRRTGLIKKTRELSILCDAHIGLIVFSATGKLSEFCSE 

QNRMPQLIDRYLHTNGLRLPDHHDDQEQIiHHEira^ 

NEIiDGLERQLEHSVLKraERKRRmEEDSrN^^ 

IEQLQCYKPGEYQQFLEQQQQQPNSVLQLATLPSEIDPT^

>G1584 (160..1281)

ATTCACATTTTTATTTATCTTTCCATTTAGCCATT 
TTTTTGACACATCACATGATCATCACATC^ 

ACACATACATCTGTGTTCTGCGGATCGAGTTAATTAGTTATGGCTTCTTCGAATAGACAC 
TGGCCAAGCATGTTC^GTCCAAACCTC^TCCC 

CCTCTCTTGCCTTCTGCTTCTCACCGATCTTCTCCTTTCTCTTCAGGATGTGAAGTGGAG 

AGGAGTCCAGAGCCAAAACCAAGATGGAATCCAAAGCCAGAGCAGATTCG 

GCAATCTTTAACTCCGGGATGGTGAATCCT 

GGCCAAGTCGGTGATGCTAACGTCTTCTACTGGTTCCAAAACCGTAAGTCCCGTAGTAAA 
CACAAACTCCGCCTCCTCC^CAACCAC^ 

CCG(^GCCGC^UVCCTTCGGCITCCTCTTCCTCTTCCTCCTCCTCTTCCTCOT 

ACCAAACCCCGAAAAAGCAAGAACAAGAACAACACTAATCTCTCrrTTGGGT^ 

ATGATGGGGATGTTTCCACCGGAACCGGCGTTTCTCTTCCCGGTCTCCACTGTCGGAGGG 

TTTGAAGGTATCACC^TCTCATCCCAATTAGGGTTTCTCTCCGGTGATATGATTGAGCAA 

CAAAAACCGGCTCCAACGTGTACCGGACTCCTGCTGAGTGAGATCATGAACGGTAGTGTG 

AGTTATGGAACTCATCATCAACAACACTTGAGTGAGAAAGAAGTTGAAGAAATGAGGATG 

AAGATGTTGCAACAGCCACAGACTCAGATTTGTTACGCT 

TCTTACAACAACAACAACAACAACAATAACATCATGCTTCATATTCCTCCCAC 

ACTGCC^CC^CTATTACTACTTCGCATTCTCTCGCTACTGTCCCATCAACTTCGGACCAG 

CTTCAAGTTCAAGCGGACGCACGAATAAGAGTTTTCATCAATGAA 

AGCTCAGGACCGTTCAATGTGAGGGATGCATTTGGGGAAGAGGTTGTTCTGATTAATTCC 
GCGGGTCAGCCC^TTGTCACCGATGAATATGGCGTCGCTCTTCACCCTCTTCAACACGGA 
GCCTCGTACTATCTGATCTAGTCGTGTGGGAGATTTGAGTTTGAAGAAGAAATTAAGACC 
TGTCTCTTTCTTTCACCATCTCTCGTACGTAGGCTTAAATGTTAAGATTTTATAAAGTAT 
TGGTTTCAGTTACCTGTTGTGACGGTGTTTATGTATGAGTTTCGGACAACATTCACAAAA 
CTCTCTCGTTAAATTGTTGACCAATAATATATGATGTGTGTTTCATTATTATCTAAAAAA 
AA 

>G1584 Amino Acid Sequence (domain in AA coordinates: TBD) 

MASSNRHWPSMFKSKPHPHQWQHDINSPLLPSASHRSSPFSSGCEVERSPEPKPRWNPKP

EQIRILEAIFNSGMVNPPREEIRIjQEYGQVGDAiTVFYWFQira 

LPQTQPQPQPQPSASSSSSSSSSSSKSTKPRKSKNKNNTNLSLGGSQMMGMFPPEPAFLF 
PVSTVGGFEGITVSSQLGFLSGDMIEQQKPAPTCTGLLLSEIMNGSVSYGTHHQQHLSEK
EVEEMRMKMLQQPQTQICYATTNHQIASYN^ 

VPSTSDQLQVQADARIRVFINEMELEVSSGPFNVRDAFGEEVVLINSAGQPIVTDEYGVA
LHPLQHGASYYLI*
>G1587 (1..816) 

ATGGGCTACATCTCCAACZAACAACCTCATCAACTATTTGCCCCTCTCTACTACTCAACCT 
CCTCTTCTTCTCACCCACTGTGATATTAACGGCAATGATCACCIATCAGCTCATAACCGCA 
TCATCAGGAGAACACGATATTGATGAACGGAAAAACAACATTCCTGCGGCGGCGACTTTG 
AGATGGAATCCGACGCCAGAGCAGATCACGACGCTAGAAGAGCTTTACAGAAGCGGAACA 
CGGACGCCGACGACGGAACAGATCCAACAGATAGCATCTAAGCTTCGTAAATATGGGAGA 
ATCGAAGGGAAGAACGTTTTCTATTGGTTTCAGAATCATAAGGCTAGAGAGAGACTAAAA 
CGCCGCCGTCGTGAAGGTGGTGCTATTATC^AACCAC^TAAAGACGTCAAGGATTCAT^ 
TCAGGTGGTCATCGAGTTGATCAGACAAAGCTCTGCCCATCTTTTCCACACACAAACCGA 
CCAGAGCCAGAGCATGAATTAGATCCTGCGAGTTA 

GAAGATCATGGGACGACTGAAGAATCTGATCAGAGGGCATCAGAGGTTGGTAAATACGCC 
ACATGGAGAAATCTTGTTACTTGGTCGATAACTCAACAACCGGAAGAGATTAATATCGAC 
GAAAATGTCAACGGAGT^GAAGAAGAAACGAGGGACAACCGGACTTTAAATCTCTTTCCG 
GTTAGGGAGTACCAAGAGAAAACAGGCCGGTTGATAGAGAAGACGAAAGCATGCAACTAC 
TGTTACTACTACGAGTTCATGCCTCTGAAGAACTGA

>G1587 Amino Acid Sequence (conserved domain in AA coordinates: 61-121)

MGYISNNOTjINYLPLSTTQPPIjIjLTHOT 

RWNPTPEQITTLEELYRSGTRTPTTEQIQQ 

RRRREGGAI I KPHKDVKDS S SGGHRVDQTKLCPS FPHTNRPQPQHEIjDPAS YNKDNNAN^ 
EDHGTTEESDQRASEVGKYATWRNLVTWSITQQPEEINIDENVNGEEEETRDNRTLNLFP
VREYQEKTGRLIEKTKACNYCYYYEFMPLKN* 
>G1588 (1..2232)

ATGTACCATCCAAACATGTTTGAGAGCCATCATATGTTCGATATGACCCCAAAGAGTACC 

TCTGATAACGACTTGGGAATCACCGGTAGCCGAGAAGATGACTTTGAGACCAAGTCAGGT 

ACCGAAGTCACTACTGAGAATCCTTCTGGTGAAGAGCTTCAAGATCCTAGCCAACGTCCC 

AACAAAAAGAAGCGTTACCATCGCCAC^CGC^CGCCAAATTCAAGAGCTCGAATCATTC 

TTTAAGGAATGTCCTCATCCAGATGATAACCAACGAAAAGAGTTGAGCCGTGATCTCAAT 

TTAGAGCCTCTTCAAGTTAAGTTTTGGTTCCAAAACAAACGCACACAGATGAAGGCACAA 

AGTGAGAGGCATGAGAACCAGATTCTAAAGTCAGACAATGACAAGCTCAGAGCAGAGAAC 

AATAGATACAAAGAAGCTCTAAGCAATGCTACATGCCCTAACTGTGGCGGTCCAGCTGCT 

ATTGGAGAAATGTCTTTTGACGAAGAACATCTCAGGATCGAAAATGCTCGGCTCCGCGAA 

GAGATTGATAGGATCTCTGCTATTGCTGCGAAATACGTTGGGAAGCCGTTAGGATCGTCT 

TTCGCTCCACTAGCGATCCACGCGCCTTCTCGTTCGCTTGATCTTGAAGTTGGAAACTTT 

GGGAACCAGACAGGCTTTGTAGGAGAAATGTATGGAAC^GGGGACATTTTGAGGTCAGTT 

TCGATTCCTTCTGAGACTGATAAGCCTATAATCGTGGAGCTAGCGGTTGCAGCTATGGAG 

GAACTCGTGAGAATGGCTCAAACTGGAGATCCTTTATGGCTTTCAACCGATAATTCAGTC 

GAGATTCTCAACGAAGAAGAGTATTTCAGAACGTTTCCGAGAGGAATTGGACCAAAGCCA 

TTAGGATTAAGATCAGAGGCGTCAAGACAATCrGCAGTTGTTATAATGAATCACATCAAT 

CTCGTTGAGATTCTC^TGGATGTGAATCAATGGTCTTGTGTTTTCTCTGGGATTGTGTCA 

AGAGCCTTGACACTTGAAGTTCTTTCAACTGGAGTTGCTGGGAACTACAACGGTGCTTTA 

CAAGTGATGACAGCTGAGTTTCAAGTTCCATC^ 

TTTGTGAGATACTGCAAACAACACAGTGACGGCTCTTGGGCTGTGGTTGATGTCTCTTTG 

GACAGCCTTAGACCAAGTACTCCAATCTTAAGAACTAGAAGAAGGCCTTCAGGTTGTCTG 

ATTCAAGAATTGCCTAATGGTTATTCTAAGGTTACATGGATAGAGCATATGGAGGTAGAT 

GATAGAT<^GTTCAC^CATGTATAAACCGTTGGTT(^GTCCGGTTTAGCTTTCGGTGCG 

AAACGTTGGGTGGCTACACTCGAACGACAATGCGAGCGGCTTGCTAGCTCCATGGCCAGC 

AACATTCCTGGTGATCTTTCCGTGATAACGAGTCCTGAAGGAAGGAAGAGTATGTTGAAG 

CTAGCTGAGAGAATGGTTATGAGTTTCTGCAGTGGTGTTGGCGCGTCGACTGCACACGCT 

TGGACAACAATGTCGACAACAGGATCCGATGATGTTCGGGTCATGACCCGCAAGAGTATG 

GATGATCCAGGAAGACCTCCGGGTATTGTTCTTAGTGCAGCTACTTCATTCTGGATCCCA 

GTTGCTCCCAAACGTGTTTTTGATTTCCTCCGTGACGAAAATTCAAGAAAAGAGTGGGAT 

ATTCTGTCAAATGGAGGTATGGTTCAGGAAATGGCTCATATAGCCAATGGTCATGAACCT 

GGAAACTGTGTCTCCTTGCTCCGAGTCAATAGTGGAAACTCGAGCCAGAGCAACATGTTG 

ATTCTACAAGAGAGCTGTACAGATGCATCAGGATCGTATGTGATTTACGCGCCAGTGGAT 

ATAGTGGCGATGAATGTGGTTCTAAGCGGTGGAGATCCTGATTACGTGGCGTTGTTGCCG 

TCTGGTTTTGCTATTTTACCGGATGGTTCGGTTGGAGGAGGAGATGGGAATCAGCATCAG 

GAAATGGTTTCTACTACTTCTTCTGGGAGTTGTGGTGGTTCGCTTTTAACCGTTGCGTTT 

CAGATTCTTGTTGACTCTGTTCCTACAGCTAAACTCTCACTTGGCTCGGTGGCTACGGTT 

AATAGTCTGATCAAATGTACGGTGGAGAGGATTAAAGCTGCTGTTTCTTGTGATGTTGGA 

GGAGGAGCGTAG 

>G1588 Amino Acid Sequence (domain in AA coordinates: 66-124) 
OTraPNMFESHHMFmTPKSTSDN^^ 

NKKKRYHRHTQRQIQELESFFKECPHPDDKQRKELSRDLNLEPLQVKFW 

SERHENQILKSDNDKLRAENNRYKEALSNATCPNCGGPAAIGEMSFDEQHLRIENARLRE

EIDRISAIAAKYVGKPLGSSFAPIAIHAPSRSLDLEVGNFGNQTGFVGEMYGTGDILRSV

SIPSETDKPIIVELAVAAMEELVRMAQTGDPLWLSTDNSVEILNEEEYFRTFPRGIGPKP

IX3LRSEASRQSAWIMNHINLVEILMDWQWSCVFSGIVSRAIiTLEVLS 

QVMTAEFQVPSPLVPTRENYFVRYCKQHSDGSWAVVDVSLDSLRPSTPILRTRRRPSGCL

IQELPNGYSKVTWIEHMEV^DRSVHNMYKPLVQSGLAFGAKR 

NIPGDLSVITSPEGRKSMLKLAERMVMSFCSGVGASTAHAWTTMSTTGSDDVRVMTRKSM
DDPGRPPGIVLSAATSFWIPVAPKRVFDFLRDENSRKEWDILSNGGMVQEMAHIANGHEP
GNCVSLLRWSGNSSQSNI^ILQESCTDA^

SGFAILPDGSVGGGDGNQHQEMVSTTSSGSCGGSLLTVAFQILVDSVPTAKLSLGSVATV
NSLIKCTVERIKAAVSCDVGGGA*
>G1589 (179..2221)

ACCAAACTCACATAGCAATCACACACATCrrCCACAAACACAGCTTGAGATGATCATGAAA 
CACGTGCATCCTCAGATCTCTATCAATCCAGCTTGGTGAAAGAAGGTCAAGAATTGAAAG 
AGAATCAAAGAAAACGACGTCGTTTCATTCGTGTGTAACAACTACTAATTATACATAGAT 
GGCTGCTTACTTTCACGGAAACCCACCGGAGATCTCTGCCGGATCCGACGGTGGTCTTCA 
AACGTTGATCCTC^TGAATCCAACTACTTACGTTCAGTACACCCAACAAGACAACGACTC 
GAACAAGAACAACAAGAGCAACAATAGCAACAAC^ 

CAAGAACAGTAGTTTCGTTTTCCTCGATTCCCACGCGCCGCAGCCAAACGCGAGCCAGCA 
GTTCGTCGGAATACCACTCTCAGGTC^CGAAGCTGCTTCCATTACAGCCGCCGACAACAT 
CTCCGTACTTCACGGTTATCCTCCGCGCGTGCAGTACAGTCTCTACGGTAGCCACCAAGT 
GGATCCC^CTC^CCAGCAAGCCGCGTGTGAGACGCCACGCGCGCAGCAAGGCCTCTCTTT 
AACCCTCTCGTCTCAACAGCAGCAGCAACAGCAAC71TCATCAACA 

CGTCGGATTCGGGTCCGGACATGGAGAAGATATCCGGGTCGGGTCTGGCTCTACAGGATC 

GGGGGTAACTU^CGGTATAGCTAATCTTGTTAGCTCCAAGTACTTGAAGGCAGCACAA 

GCTTCTTGACGAAGTAGTCAACGCTGATTCCGATGAC^TGAACGCTAAATCCCAACTATT 

CTCATCGAAAAAGGGTAGTTGCGGAAATGATAAACCTGTCGGAGAATCATCGGCCGGCGC 

TGGAGGAGAAGGTTCCGGTGGCGGAGCAGAAGCAGCCGGGAAACGTCCGGTGGAGCTAGG 

CACGGC^GAGAGACAAGAAATACAGATGAAGAAAGCAAAACTTAGTAACATGCTTCATGA 

GGTGGAGCAGAGATATAGACAGTACCACCAGCAGATGCAGATGGTGATCTCTTCGTTCGA 

GCAAGCGGCAGGGATAGGATCAGCGAAGTCATACACGTCGCTAGCATTGAAAACCATATC 

AAGACAGTTCCGTTGCTTGAAAGAGGCGATCGCTGGTCAGATAAAAGCGGCCAACAAGAG 

TCTTGGGGAGGAAGATTCAGTGTCTGGTGTTGGGAGGTTTGAGGGGTCGAGGCTCAAGTT 

CGTGGACCACCACTTGAGACAGCAAAGAGCTCTTCAACAACTGGGAATGATTCA 

TTCCAATAATGCTTGGAGACCTC7VACGTGGTCTCCCAGAACGAGCCGTCTCAGTTCTCCG 

TGCTTGGCTCTTCGAACACTTTCTTCATCCATACCCTAAGGATTCGGACAAGCACATGCT 

AGCTAAGCAAACAGGACTCACTCGTAGGCAGGTGTCGAACTGGTTTATAAACGCGAGAGT 

TCGGTTATGGAAACCAATGGTGGAGGAGATGTACATGGAGGAAATGAAGGAGCAGGCAAA 

GAACATGGGATCCATGGAAAAGACTCCTTTGGATCAAAGCAACGAAGATTCTGCTTCAAA 

GTCAACAAGTAACCAAGAAAAGAGCCCAATGGCGGACACTAATTACCATATGAATCCCAA 

TCACAACGGTGACCTAGAAGGCGTCACTGGAATGCAAGGATGCCCCAAGAGACTAAGAAC 

CAGCGACGAGACAATGATGCAGCCAATAAATGCGGATTTCAGCTCCAACGAGAAGCTCAC 

GATGAAGATTCTAGAAGAACGGCAAGGGATAAGATCAGATGGTGGCTACCCTTTCATGGG 

TAATTTCGGGCAATACCAAATGGATGAGATGTCAAGATTTGATGTAGTCTCAGACCAGGA 

GCTCATGGCGCAAAGGTACTCAGGAAACAACAATGGCGTGTCCCTC^CGTTAGGTTTACC 

TCATTGTGATAGCTTGTCGTCCACGGACCATCAGGGTTTCATGCAGACCCACCATGGGAT 

TCCTATAGGGAGAAGAGTGAAAATAGGAGAAACAGAGGAATATGGACCCGCCACCATCAA 

TGGTGGTAGCTCGACCACAACCGCACATTCATCAGCGGCAGCTGCCGCG 

GATGAACATACAGAACCAGAAGAGATATGTGGCTCAGTTATTGCCCGACTTCGTTGCATA 

AACCCATCTCTCTAGAAGGAGAAACCGAAACAGGTTATTATATACGTTTCTAGTTTTTAA 

TTAGTATATAGTTTCTCATACGATTGAACCAAAAC^ 

TTGGTTATATATGGCCGACGGGCTACGTCAGGGCCCTGACGTAGC 

>G1589 Amino Acid Sequence (conserved domain in AA coordinates: 384-448)
MAAYFHGNPPEISAGSDGGLQTLILMNPTTYVQY 

NI^SSFVTIjDSHAPQPNASQQFVGIPLSGHEAASITAADNISVLHGYPPRV 
VDPTHQQAACETPRAQQGLSLTLSSQQQQQQQHHQQHQPIHVGFGSGHGEDIRVGSGSTG
SGVTNGIANIiVSSKYLKAAQELLDEVVNADS 

AGGEGSGGGAEAAGKRPVELGTAERQEIQMKKAKLSNKLHEVEQRYRQYHQQMQMVISSF
EQAAGIGSAKSYTSLALKTISRQFRCLKEAIAGQIKAANKSLGEEDSVSGVGRFEGSRLK

fvdhhlrqqra3x3qlgmiqhps^awrpqrglperavsvij^wlfehflhpypkd 
lakqtgltrsqvsnwfinarvrlw 

KSTSNQEKSPMADTNYHMNPNHNGDLEGVTGMQGCT 

TMKILEERQGIRSDGGYPFMGNFGQYQMDEMSRFDVVSDQELMAQRYSGNNNGVSLTLGL 
PHCDSLSSTDHQGFMQTHHGIPIGRRVKIGETEEYGPATINGGSSTTTAHSSAAAAAAYN
GMNIQNQKRYVAQLLPDFVA*



>G160 (38..784)

TCAAATTTGTCATTTGTTTATTCAAATTTTTGAGAAAATGGTGAGAAGTACCAAAGGTCG 

TCAGAAAATAGAGATGAAAAAAATGGAAAACGAAAGCAACCTTGAGGTTACTTTCTCAAA 

AAGAAGATTCGGTCTTTTCAAAAT^GCTAGTGAACTTTGCACATTAAGTGGTGCAGAGAT 

TCTGTTGATTGTGTTCTCTC<^GGTGGGAAAGTGTTTTC 

AGAACTGATTC^TCGCTTTTCGAATCCTAAC 

CAACAATCTCCAACTTGTTGAAACCCGTCCGGATAGAAAT^ 

ACTCACTGAGGTGCTGGCAAACCAGGAAAAGGAGAAAC^GAAGAGAATGGTTTTGGACCT 
ATTGAAAGAATCCAGAGAACAAGTAGGAAACTGGTATGAAAAAGATGTGAAAGATCTCGA 
CATGAATGAAACCAACCAGCTGATATCTGCTCTTCAAGATGTGAAAAAGAAACTGGTAAG 
AGAAATGTCTCAATATTCTCAAGTAAATGTTTC 

CGTGATTGGTGGTGGTAATGTTGGCATTGATCTTTTTGATCAAAGAAGAAATGCATTCAA 

C^ATAATCCAAACATGGTGTTTCCCAATCATACACCACCAATGTTTGGATACAA 

TGGAGTTCTCGTTCCGATATCCAACATGAACTACATGTCAAGTTACAACTTCAACCAGAG 

CTAGAGTCTGAAGCTAGAAGAACATCCTAATCAATATTTGCGTTATTTTGGCTATGGTTA 

CTGTTAGGATTGTTCTTGTATTGTGAGACTTAAGTT^ 

GTTGGTTGGTTTTTCATTTTATTCGTCG 

TCCCAGAATAAATTTATTTATCCTTTAAAAA 

>G160 Amino Acid Sequence (domain in AA coordinates: 7-62) 

MVRSTKGRQKIEMKKMENESNLQVTFSKRRFGLFKKASELCTLSGAEILLIVFSPGGKVF

S FGHPSVQELIHRFSNP1JHNSAI VHHQNNNLQLVETRPDRNI QYLNNIIjTEVIjANQEKEK 

QKRMVLDLIiKESREQVGNWYEKDVKDLDMNETNQL 

NYFGQSSGVIGGGNVGIDLFDQRRNAFNYNPN^^ 

SSYNFNQS* 

>G1636 (19..666)

GAGTAATCATCAACGATTATGGCGTCAAGTCAGTGGACGAGGTCGGAGGATAAGATGTTT 
GAGCAAGCTTTGGTTCTTTTTCCTGAAGGATCTCCTAATCGGTGGGAGAGAATCGCTGAT 
CAGCTTCATAAATCTGCTGGTGAAGTTAGGGAGCATTACGAGGTCTTGGTTCATGATGTT 
TTCGAGATTGATTCTGGTCGAGTTGATGTCCCTGATTACATGGATGACTCGGCGGCTGCG 
GCGGCGGGTTGGGATTCCGCTGGTCAGATCTCTTTTGGGTCTAAACATGGCGAGAGTGAA 
CGCT^AAAGAGGAACTCCTTGGACAGAGAACGAACACAAATTGTTTCTGATCGGATTAAAG 
AGATATGGTAAGGGAGATTGGAGGAGTATCTCGAGAAACGTTGTGGTGACGAGGACACCG 
ACGCAAGTCGCGAGTCT^CGCTCAGAAGTATTTTCTGAGACAGAACTCGGTGAAGAAGGAG 
AGGAAAAGGTCGAGCATCCATGATATAACTACGGTTGATGCTACTTTGGCTATGCCTGGG 
TCTAAGATGGACTGGACTGGCCAACACGGGAGTCCTGTTCAGGCGCCGCAGCAGCAACAG 
ATTATGTCTGAGTTCGGTCAGCAATTGAATCCTGGTCATTTCGAGGATTTTGGGTTTCGG 
ATGTGATG 

>G1636 Amino Acid Sequence (domain in AA coordinates: 100-165) 
MASSQWTRSEDKMFEQALVLFPEGSPNRWERIADQLHKSAGEVREHYEVLVHDVFEIDSG
RVDVPDYMDDSAAAAAGWDSAGQISFGSKHGESERKRGTPWTENEHKLFLIGLKRYGKGD
WRSISRNVVVTRTPTQVASHAQKYFLRQNSVKKERKRSSIHDITTVDATLAMPGSNMDWT
GQHGSPVQAPQQQQIMSEFGQQLNPGHFEDFGFRM* 
>G1642 (1..1077) 

ATGGGTCATCACTCATGCTGCAAGAAGCAAAAGGTGAAGAGAGGGCTTTGGTC^ 

GAAGACGAAAAGCTCATCAACTACATCAATTCATATGGCCATGGATGTTGGAGCTCTGTT 

CCTAAACATGCAGGTTTGCAGAGATGTGGAAAGAGTTGTAGATTAAGATGGATAAATTAT 

CTAAGACCTGATCTTAAACGTGGAAGCTTCTCTCCTCAAGAAGCTGCTCTTATCATTGAG 

CTTC^CAGCATTCTTCGTAACAGA^ 

GATAACGAGGTCAAGAATTTCTGGAACTCGAGCATTAAAAAGAAGCTCATGTCTCACCAT 
CATCACGGTCATCATCATCATCATCTCTCTT^ 

TATCACAATGGATTCAACCCTACTACAGTCGACGATGAAAGTTCAAGATTCATGT 
ATCATCACAAACACTAACCCTAATTTCATCAC 

GATGTTATGACCCCATTGATGTTCCGAACCTCTAGAGAAGGAGATTTCAAGTTTCTAACC 
ACAAACAACCCAAACGAATCTCATCACCATGATAAT^ 
TTGTCACCC^CACCAACTATAAACAATCATCATCAACCTTCACTTTCT^ 
GATAATAATCTCCAATGGCGAGCGTTACCAGATTTCCC^ 

CAAGAAACCCTTCAAGATTATGATGATGCTAATAAACTCAACGTGTTTGTGACACCATTC 
AACGATAATGCCAAAAAGTTATTATGTGGAGAAGTTCTCGAAGGCAAAGTACTATCTTCC

TCCTCACCAATTTCACAAGATC^CGGCCTTTTTCTTCCCACCACGTACAACTTTCAAATG 

ACTTCTACGAGTGATCATCAACATCATCATCGAGTGGACTCATACATCAATCACATGATC 

ATACCATCATCATCCTCATCGTCGCCAATCTCTTGTGGACAGTACGTCATAACTTAA 

>G1642 Amino Acid Sequence (domain in AA coordinates: TBD) 

MGHHSCCNKQKTORGLWSPEEDEKIiINYINSYGHGCWSSVPKHAGIiQ^ 

LRPDIiKRGSFSPQEAALI I ELHS ILGNRWAQIAKHLPGRTDNEVKNFWNSS IKKKLMSHH 

HHGHHHHHLSSMASLLTNLPYHNGFNPTTVDDESSRFMSNIITNTNPNFITPSHIjSLPSP 

HVMTPLMFPTSREGDFKFLTTNl^NQSHHHDHN^ 

DNNLQWPALPDFPASTISGFQETLQDYDDANKIJSrVFVTPFOT 

S S PI SQDHGLFLPTTYNFQMTSTSDHQHHHRVDS Y INHMI IPSSSSSSPIS CGQYVI T* 
>G1747 (1..777) 

ATGAAAATGATGCAAGAGGAGGGAAACCGAAAAGGTCCATGGACAGAAC^lGGAAGACATA 
CTTCTGGTAAATTTTGTTCACTTATTT^ 

TCAGGTTTGAACAGAA(^GGAAAGAGTTGCAGGCTAAGATGGGTTAATTACCTACA 

GGTCTC^AACGTGGC^UVGATGACGCCTCAAGAAGAGCGCCTCGTCCTTGAGCTTCACGCT 

AAGTGGGGAAACAGGTGGTCGAAAATAGCCCGAAAATTGCCGGGACGAACGGATAACGAG" 

ATAAAGAACJTACTGGAGGACTCATATGAGGAAGAAAGCTCAAGAAAAGAAGCGTCCTGTT 

TCCCCAACTTCCTCATTTTCCAACTGCAGCTCGTCATCTGTGACCACTACCACCACCAA 

ACTTCAAGATACATCGTGCCACTCGCGTAAATCTTCAGGGC^ 

GGAGGTTCCCGATCCACTAGAGAGATGAATCAAGAAAACGAAGACGTGTACTCGTTGGAT 
GATATATGGAGAGAGATTGATCACTCAGCAGTAAACATAATAAAACCGGTTAAAGACATC 
TACTCAGAACAAAGCCATTGCTTAAGTTACCCAAA 

TCATTGGATTCTATATGGAACATGGATGCAGATAAAAGTAAGATATCGTCTTACTTTGCA 
AATGATCA.GTTTCCTTTCTGTTTCCAACACAGTAGATCACCATGGTCGTCAGGTTAA 
>G1747 Amino Acid Sequence (domain in AA coordinates: 11-114) 
MKMMQEEGNRKGPWTEQEDILLVNFVHLFGDRRWDFIAKVSGLNRTGKSCRLRWVNYLHP
GLKRGKMTPQEERLVLELHAKWGNRWSKIARKLPGRTDNEIKNYWRTHMRKKAQEKKRPV
SPTSSFSNCSSSSVTTTTTNTQDTSCHSRKSSGEVSFYDTGGSRSTREMNQENEDVYSLD
DIWREIDHSAVNIIKPVKDIYSEQSHCLSYPNLASPSWESSLDSIWNMDADKSKISSYFA 
NDQFPFCFQHSRSPWSSG* 
>G1749 (59..535)

GAACACTTCTCAGTGACCGTGAGCAACGAATTATTTTCAGTTCAACGACTCCGCGGAAAT 
GGAAAATTCAGAAAATGTTCCCTCTTACGATCfiAAACATCAATTTCACTCCTAATTTGAC 
GAGAGATCAAGAACATGTGATCATGGTCTCTGCTTTGCAAGAAGTAATATCCAACGTCGG 
AGGTGACACGAACTCGAATGCATGGGAAGCTGATCTTCCACCTTTGAACGCTGGCCCTTG 
TCCTCTTTGTAGTGTCACCGGCTGCTACGGTTGCGTCTTCCCACGACACGAGGCGATAAT 
TAAGAAGGAGAAGAAGCACAAAGGAGTGAGGAAAAAACCATCAGGTAAATGGGCGGCGGA 
GATATGGGATCCGAGTTTGAAAGTAAGGAGATGGCTTGGAACGTTTCCAACAGCGGAGAT 
GGCGGCTAAGGCTTACAACGATGCGGCGGCTGAGTTTGTCGGAAGAAGATCAGCAAGACG 
TGGCACAAAGAACGGAGAGGAAGCATCTACCAAGAAGACGACTGAGAAAAATTAACGGAG 
AAGGAGCACGTATAGAAAGGCAGGAAGAGGCATCTTACTTGCTTCACAAGTAAATGAGAA 
TTTTTTTGAAAAGTAAAAACGTTATTTTGTTTGGTAATAAAATAAAGTAAAAC^AAATAT 
TGCTAACGCAAGACTTATCAAGTTCAGTCGTGACTGTGAGTGTGTTTTTATGTATCTTAC 
TTCATTTTTTGTCTTTCAATTGTGTGTGTGTGTGT 

>G1749 Amino Acid Sequence (conserved domain in AA coordinates: 84-155)

MENSENVPSYDQNINFTPNLTRDQEHVIMVSALQQVISNVGGDTNSNAWEADLPPLNAGP

CPLCSVTGCYGCVFPRHEAIIKKEKKHKGVRKKPSGKWAAEIWDPSLKV^ 

MAAKAYNDAAAEFVGRRSARRGTKNGEEASTKKTTEKN* 

>G1751 (117..923)

AAACACAAACAAAACTGA.TATTTTCAATCTCCAGGTGCTTTACACCAACAGAGTC 
AAAACAAAAACCAAACTCGGATTTAGTTTGACAGAAGAAGGAATCGAGAGTCGGGTATGC 
ATTATCCTAACAACAGAACCGAATTCGTCGGAGCTCCAGCCCCAACCCGGTATCAAAAGG 
AGCAGTTGTCACCGGAGCAAGAGCTTTCAGTTATTGTCTCTGCT 

CAGGGGAAAACGAAACGGCGCCGTGTCAGGGTTTTTCCAGTGACAGCACAGTGATAAGCG 
CGGGAATGCCTCGGTTGGATTCAGACACTTGTCAAGTCTGTAGGATCGAAGGATGTCTCG 
GCTGTAACTACTTTTTCGCGCCAAATCAGAGAATTGAAAAGAATCATCAACAAGAAGAAG 
AGATTACTAGTAGTAGTAACAGAAGAAGAGAGAGCTCTCCCGTGGCGAAGAAAGCGGAAG

GTGGCGGGAAAATCAGGAAGAGGAAGAACAAGAAGAATGGTTACAGAGGAGTTAGGCAAA 

GACCTTGGGGAAAATTTGCAGCTGAGATCAGAGATCCTAAAAGAGCCACACGTGTTTGGC 

TTGGTACTTTCGAAACCGCCGAAGATGCGGCTCGAGCTTATGATCGAGCCGCGATTGGAT 

TCCGTGGGCCAAGGGCTAAACTCAACTTCCCCTTTGTGGATTACACGTCTTGAGTTTCAT 

CTCCTGTTGCTGCTGATGATATAGGAGCAAAGGCAAGTGCAAGCGCCAGTGTGAGCGCCA 

CAGATTCAGTTGAAGCAGAGCAATGGAACGGAGGAGGAGGGGATTGCAATATGGAGGAGT 

GGATGAATATGATGATGATGATGGATTTTGGGAATGGAGATTCTTCAGATTCAGGAT^ATA 

CAATTGCTGATATGTTCCAGTGATAAATGAGCTCTTTCTTGTTGGCGTTTTTTGGAGT^ 

AGTGCAAGAAGAGATTGACACTGTGGCTTGTTTAAAGTGAACAAGAACAAGAAAGCATGT 

AATTAGTAGTCTCATTCTTTTGTTTGTGGTCAATTCTATGTTTATCTCATATAAAATCTG 

AGTTAAACCTATCTGAGGAGAGAGTAAATAAAGAGGTTAAGAA 

>G1751 Amino Acid Sequence (domain in AA coordinates: TBD) 

MHYPNNRTEFVGAPAPTRYQKEQLSPEQELSVIVSALQHVISGENETAPCQGFSSDSTVI

SAGMPRLDSDTCQVCRIEGCIiGC^FFAPNQRIEKNHQQEEEITSSSNRRRESSPVAKKA 

EGGGKIRKRKNKKNGYRGWQRPWGKFAAEIRDPKRATRWL^^ 

GFRGPRAKLNFPFVDYTSSVSSPVAADDIGAKASASASVSATDSVKAEQWNGGGGDCNME

EI^OJMMMMMDFGNGDSSDSGNTIADMFQ* 

>G1752 (25..756)

AAAAAAAAAAAAAAAAAAAAACTTATGGAATATTCCCAATCTTCCA 

AGTTCTTGGAGCTCATCACAAGAATCACTCTTATGGAACGAGAGCTGTTTCTTGGATCAA 
TCATCTGAACCTGAAGCCTTCTTTTGCCCTAATTATGATTACTCCGATGACTTTTTCTCA 
TTTGAGTCACCGGAGATGATGATTAAGGAAGAAATTCAAAACGGCGACGTTTCTAACTCC 
GAAGAAGAAGAAAAGGTTGGAATTGATGAAGAAAGATCATACAGAGGAGTGAGGAAAAGG 
CCGTGGGGGAAATTTGCAGCGGAGATAAGAGATTGAACGAGGAATGGAATTAGGGTTTGG 
CTCGGGACATTTGACAAAGCCGAGGAAGCCGCTCTTGCTTATGATCAAGCGGCTTTCGCC 
ACAAAAGGATCTCTTGCAACACTTAATTTCCCGGTGGAAGTGGTTAGAGAGTCGCTAAAG 
AAAATGGAGAATGTGAATCTTCATGATGGAGGATCTCCGGTTATGGCCTTGAAGAGAAAA 
GATTCTC^TCGAAACCGGCCTAGAGGGAAAAAGCGATCCTCTTCTTCTTCTTCTTCTTCT 
TCTAATTCTTCTTCTTGCTCTTCTTCT 

AAGCAGAGTGTTGTGAAGGAAGAAAGTGGTAGACTTGTGGTTTTTGAAGATTTAGGTGCT 
GAGTATTTAGAACAACTTCTTATGAGCT 

CCACTATTAAACTTTAATTTTGTGATAATTAATCTTGAAATTTGTTTTGTTCATT 

ATTTCTTTGGTTCTCTT AT r i M l u I w l^TTTGTTGTATCCAAATGAAATTATTGGAAGAGATG 

GTGATGTTAAAGTGTATATATATAAAAAAAAAA 

>G1752 Amino Acid Sequence (domain in AA coordinates: TBD) 
MEYSQSSMYSSPSSWSSSQESLLWNESCFLDQSSEPQAFFCPNYDYSDDFFSFESPEMMI 
KEEIQNGDVSNSEEEEKVGIDEERSYRGVRKRPWGKFAAEIRDSTRNGIRVWLGTFDKAE
EAALAYDQAAFATKGSIATLNFPVEVTOESLK 

GKKRSSSSSSSSSNSSSCSSSSSTSSTSRSSSKQSWKQESGTLWFEDLGAEYIjEQIiLM 

ssc* 

>G1763 (33..977)

TCGGTGGTGGCCACG
GCGGCGAGCTTATGGAAGC^CTTCAACCTTTTT^^ 

ATCCTGCGTTTGCGTCCTGAAACGATGCGTTTGCGTCTGCCCCAAACGACCCATTTTCTT 

CTTCTTCTTACTATAATCCTCATGCATCTTTCTTCCCTTCAGATTCCACAACCACTTACC 

CGGATATTTATTCTGGATCCATGACCTATCCATCTTCATTCGGGTCGGATCTTCAACAAC 

CCGAAAACTACCAA5CTCAGTTCCATTACCAAAACACTATCACTTACACTCACCAAGACA 

ACAAC^CTTGCATGCTCAACTTCATTGAGCCGAGCCAACCGGATTTTATGA^ 

GTCCGAGTTCGGGTTCGGTTTCAAAACCGGCTAAGCTCTATAGAGGAGTGAGGCAAAGAC 

ATTGGGGAAAATGGGTCGCGGAGATCCGTTTACCCAGGAACCGAACCCGACTTTGGCTCG 

GAACATTCGACACGGCTGAAGAAGCCGCGTTGGCTTATGATCGCGCCGCGTTTAAGCTTC 

GTGGTGACTCGGCTCGGCTTAACTTCCCAGCTCTCCGATACCAAACCGGCTCGTCTCCGT 

CTGACGTTGGCGAATACGGACCTATTCAAGCTGCCGTTGACGCCAAGCTAGAAGCCATAT 

TAGCTGAGCCGAAGAATCAGCCGGGCAAAACGGAGAGGACGTCGAGGAAACGAGCTAAAG 

CCGCGGCTTCTTCAGCTGAGCAGCCGTCAGCGCCACAACAACATTCCGGGTCGGGTGAAA 

GTGATGGGTCGGGTTCACCGACTTCGGATGTTATGGTGCAGGAGATGTGCCAAGAGCCAG 
AGATGCCATGGAATGAAAATTTCATGCTCGGCAAGTGTCCTTCTTATGAGATAGATTGGG
CTTCAATTTTATCGTGAAAAATTAGGATTCAATTCATTTT^ 
TATTTTCTTTTAAACTTTAGGGTTATTAGCTGTGCGTAAAATT^ 
TATGAATGTAATGCAAGTGTGTAAATTATGGACT^GCTCAAGCTTTTTTGTTAAAA 

>G1763 Amino Acid Sequence (conserved domain in AA coordinates: 140-209)
MADLFGGGHGGELMEALQPFYKSASTSASNPAFASS 

FPSHSTTTYPDIYSGSMTYPSSFGSDLQQPENYQSQFHYQNTITYTHQDITNTCMLNFIEP 

SQPDFMTQPGPSSGSVSKPAKIiYRGvl^QRHWGKWVAEIRLPRNRTRLWIjGTFD 

AYDRAAFKLRGDSARLNFPALRYQTGSSPSDVGEYGPIQAAVDAKLEAILAEPKNQPGKT

ERTSRKRAKAAASSAEQPSAPQQHSGSGESDGSGSPTSDVMVQEMCQEPEMPWNENFMLG 

KCPSYEIDWASILS* 

>G1766 (32..1216)

AGGCTATTCTCGGAAAAACAAAGAATAAAGAATGAATTCGTTTTCACAAGTACCTCCTGG 

CTTCAGATTTGATCCTACTGATGAAGAACTTGT^ 

ATCAAAGAGAATAGAAATCGATATCATCAAGGATGTTGATC 

TGATCn?TCAAGAGTTATGCAAGATAGGAAACGAAGAGCAGAGCGAATGGTACTTCTTTAG 

TCATAAAGACAAGAAGTATCCCACGGGAACTCGAACCAATAGAGCCACGAAAGCAGGATT 

TTGGAAAGCCACTGGAAGAGACAAGGCTATATATATAAGACATAGTCTTATCGGTATGAG 

GAAAACACTTGTGTTTTACAAAGGAAGAGCCCCAAATGGTCAGAAATCCGATTGGATCA^ 

GCACGAATATCGCTTAGAAACAAGTGAAAATGGAACCCCTCAGGAAGAAGGATGGGTAGT 

ATGTAGGGTATTCAAGAAGAAATTGGCAGCGACAGTGAGGAAAATGGGAGATTACCATTC 

ATCACCATCGC^GCATTGGTACGATGAT(^GCTCTCTTTTATGGCCTCCGAGATCATTTC 

TAGCTCTCC^CGACAGTTTCOTCCCT^^ 

ATTGCCTTGTGGCCTCAATGCATTCAACAACAACJ^ 

GCTCGAGTTACATTACAATCAAATGGTAGAACATCAACAACAAAACCATCATCTTC 
ATCTATGTTTCTCCAGCTTCCTCAGCTCGAAAGCCCTACCAGTAATTGCAATTCTGACAA 
CAACAATAACACAAGAAATATTAGTAACTTGGAGAAAT 
ACAATTGCAACAAGGGAATCAAAGTTTCAGCTCTCT 

AATGACTACTGACTGGAGAGTTCTCGATAAATTTGTTGCTTGACAGCTTAGCAATGA 
AGAGGCTGC^GCCGTGGTTTCTTCTTCTTCTCATCAAAACAACGTCAAGATTGACACGAG 
AAACACGGGTTATCATGTGATAGATGAGGGAATAAATTTGCCGGAGAATGATTCTGAAAG 
GGTTGTTGAAATGGGAGAAGAGTATTCAAATGCTCATGCTGCTT 

TCAGATTGATCTCTAGAAATAGTGATAGAGAGATGAAAAAGATGCAAGGTGAATATATAT 
GAAAATACATGCACACTAGTGTTATTTATACTTAAAGATGGAAGGGGAAAAACAAGGAGT 
TATTTCCTGGATTTATGGAGGTTTTGTACATAATAAAAACCTACAACCATATGGTATTTT 
CTTTTGAAAAAAAAAAAAAAAAAAAAAAAA 

>G1766 Amino Acid Sequence (domain in AA coordinates: 10-153) 
MNSFSQVPPGFRFHPTDEELViyYXTjRKK^ 

EEQSEWYFFSHKDKKYPTGTRTNRATKAGFWKATGRDKAIYIRHSIjIGMRKTLVFYKG^ 

PNGQKSDWIMHEYRIjETSENGTPQEEGWVVCRVFKK 

LSFMASEIISSSPRQFLPimHYNRHHHQQTLPCGI^^ 

HQQQNHHLRESMFLQLPQIiESPTSNC^SDNNNNTRNISNIiQKSSNISHEEQLQQGNQSFS 
SLYYDQGVEQMTTDWRVTjDKFVAS QLSNDEEAAAWS S SSHQNNVKIDTRNTGYHVIDEG 
INLPENDSERWEMGEEYSNAHAASTSSSCQIDL* 
>G1767 (1..1596)

ATGGATACTCTCTTTAGACTAGTCAGTCTCCAACAACAACAACAATCCGATAGTATCATT 
ACAAATCAATCTTCGTTAAGC^GAACTTCCACCACCACTACTGGCTCTCCACAAACTGCT 
TATCACTACAACTTTCCACAAAACGACGTCGTCGAAGAATGCTTCAACTTTTTCATGGAT 
GAAGAAGACCTTTCCTCTTCTTCTTCTCACC^ 

ACTTACTACTCTCCTTTCACTACTCCCACCCAATACCATCCCGCCAC^TCATa^CCCCT 
TCCTCCACCGCCGC^GCCGC^GCTTTAGCCTCGCCTTACTCCTCCTCCGGCCACCATAAT 
GACCCTTCCGCGTTCTCCATACCTCAAACTCCTCCGTCCTTCGACTTCTCAGCCAATGCC 
AAGTGGGCAGACTCGGTCCTTCTTGAAGCGGCACGTGCCTTCTCCGACAAAGACACTGCA 
CGTGCGCAACAAATCCTATGGACGCTCAACGAGCTCTCTTCTCCGTACGGAGACACCGAG 
CAAAAACTGGCTTCTTACTTCCTCC2AAGCTCTCTTCAACCGCATGACCGGTTCAGGCGAA 
CGATGCTACCGAACCATGGTAACAGCTGCAGCCACAGAGAAGACTTGCTCCTTCGAGTCA 
ACGCGAAAAACTGTACTAAAGTTCCAAGl^AGTTAGCCCCTGGGCCACGTTTGGACACGTG 
GCGGC^lAACGGAGCAATCTTGGAAGCAGTAGACGGAGAGGCJU^

ATAAGCTCCACGTTTTGCAOTCAATGGCCGACTCTTCTAGAA.GCTTTAGCCACZAAGATCA 
GACGACACGCCTCACCTAAGGCTAACCAC^GTTGTCGTGGCCAACAAGTTTGTCAACGAT 
CAAACGGCGTCGCATCGGATGATGAAAGAGATCGGAAACCGAATGGAGAAATTCGCTAGG 
CTTATGGGAGTTCCTTTCAAATTTAACATTATTCATCACGTTGGAGATTTATCTGAGTTT 
GATCTCAACGAACTCGACGTTAAACCAGACGAAGTCTTGGCCATTAACTGCGTAGGCGCG 
ATGCATGGGATCGCTTCACGTGGAAGCCCTAGAGACGCTGTGATATCGAGTTTCCGACGG 
TTAAGACCGAGGATTGTGACGGTCGTAGAAGAAGAAGCTGATCTTGTCGGAGAAGAAGAA 
GGTGGCTTTGATGATGAGTTCTTGAGAGGGTTTGGAGAATGTTTACGATGGTTTAGGGTT 
TGCTTCGAGTCATGGGAAGAGAGTTTTCCAAGGACGAGCAACGAGAGGTTGATGCTAGAG 
CGTGGAGCGGGACGTGCGATCGTTGATCTTGTGGCTTGTGAGCCGTCGGATTCCACGGAG 
AGGCGAGAGACAGCGAGGAAGTGGTCGAGGAGGATGAGGAATAGTGGGTTTGGAGCGGTG 
GGGTATAGTGATGAGGTGGCGGATGATGTCAGAGCTTTGTTGAGGAGATATAAAGAAGGT 
GTTTGGTCGATGGTACAGTGTCCTGATGCCGCCGGAATATTCCTTTGTTGGAGAGATCAG 
CCGGTGGTTTGGGCTAGTGCGTGGCGGCCAACGTAA 

>G1767 Amino Acid Sequence (domain in AA coordinates: 255-272) 
MDTLFRLVSLQQQQQSDSIITNQSSLSRTSTTTTGSPQTAYHYNFPQNDVVEECFNFFMD
EEDLSSSSSHHNHHNHNNPOTTYYSPFTTPTQYH^ 

DPSAFSIPQTPPSFDFSANAKWADSVLLEAARAFSDKDTARAQQILWTLNELSSPYGDTE
QKLASYFLQALFNRMTGSGERCYRTMVTAAATEKT^ 
AANGAILRAVDGEAKliHIVDISSTFCTQWPTIiLEA 
QTASHRMMKEI GNRMEKFARIiMGVPFKFNI IHHVGDLSEFDLI^ 

MHGIASRGSPRDAVTSSFRRLRPRIVTVVEEEADLVGEEEGGFDDEFLRGFGECLRWFRV
CFESWEESFPRTSNERLMLERAAGRAIVDLVACEPSDSTERRETARKWSRRMRNSGFGAV
GYSDEVADDVRALLRRYKEGVWSMVQCPDAAGIFLCWRDQPVVWASAWRPT*
>G1778 (1..627) 

atgatgggatacgaaacaaactctaat^ 

CAAAACCACCACAACTACGATCCTTATAATAATTTCTCTTCATCAACTT 

ACTCTCTCACTTGGAACACCCTCTACTCGTCTCGACGACC^CCATAGATTTTCTTCTGCT 

AATTCTAACAACATCTCCGGCGACTTTTATATTCACGGAGGAAACGCTAAGACTTCTTCG 

TACAAGAAGGGTGGTGTTGCTCATAGCCTACCTCGCCGTTGTGCTAGCTGCGACACCACT 

TCAACTCCTCTATGGAGAAACGGACCAAAAGGACCTAAGTCGTTATGTAACGCGTGTGGA 

ATCCGATTCAAGAAAGAGGAGAGGCGTGCGACGGCCAGAAACTTAACGATCTCCGGTGGA 

GGTTCATCAGCGGCAGAAGTCCCAGTAGAGAATTCGTACAACGGAGGTGGAAACTATTAC 

AGTCATCATCAT<^TCACTATGCCTCGTCGTCGCCGTCGTGGGCTGATC^GAACACACAA 

AGAGTTCCATATTTCTCACCGGTTCCGGAGATGGAATATCCCTACGTGGATAACGTCACG 

GCTTCTTCTTTTATGTCTTGGAATTGA 

>G1778 Amino Acid Sequence (domain in AA coordinates: 94-119)

MMGYQTNSNFSMFFSSENDDQNHHNYDPYNNFSSSTSVDCTLSLGTPSTRLDDHHRFSSA

NSNNISGDFYIHGGNAKTSSYKKGGVAHSLPRRCASCDTTSTPLWRNGPKGPKSLCNACG

IRFKKEERRATARl^TISGGGSSAAEVPVENSYNGGGNYYSHHHHHYASSSPSWAHQNTQ 

RVPYFSPVPEMEYPYVDNVTASSFMSWN* 

>G1789 (108..413)

CAAGGACTCTGCGACATCTGTGCAACATATCATOT 
TTTATTACTACACAAAACCAAACATCATCA 

GAATGTCTTCTTATGGCTCTGGCTCATGGACTGTTAAGCAGAACAAAGCCTTTGAGCGTG 
CTCTAGCAGTCTATGACCAAGACACTCCGGACCGTTGGCACAATGTTGCTAGAGCTGTTG 
GTGGTAAAACACC^AAGAAGCTAAGAGACT^GTATGACCTTCTAGTTCGTGACATCGAAA 
GCATCGAGAATGGTCACGTGCCATTCCCTGACTACAAGACra 

GAGGCAGGCTGCGTGATGAGGAAAAGAGGATGAGAAGCATGAAGCTGCAGTGAGACAAGA 
AGCAAGZVAAACCTAACTACGTATGATCGTCAAAATAAAAGAGAATCACTTCA 
TGTTTTTTTCAATGTCTGACGAATCAATGTTTTTT^ 
TAAGAAATGGTTTTTTTTTCGAGGCAACAAAAAAAAAA 

>G1789 Amino Acid Sequence (domain in AA coordinates: 1-50) 
MASGSMSSYGSGSWTVKQNKAFERAIAVYDQDTPDRWHNV 
RDIESIENGHVPFPDYKTTTGNSNRGRLRDEEKRMRSMKLQ*
>G1790 (63..1346)
GAAAAAGACTTCACTTTTTTTT^

CAATGGAGAATTTCGTCGACGAGAATGGTTTTGCTTCTCTAAACCAAAACATCTTCA 

GTGATC^U^GAACAC^TGAAAGAAGAAGATTTTCCATTCGAAGTCGTCGACCAATCAAAAC 

CTACAAGCTTTCTTCAAGATTTTCACCATCTTC 

ATCATCATGGCTCCTCATCTTCA<^TCCTO 

TCAATAATGCTCCTTTCGAGCATTGCTCTTACCAAGAJ^ 

CTAAACC^AATTTGATGAATCATCATCATTTC 

GTAATCATCATCATCATCAAGAGATCAATTTGGTCGATGAACATGATGATCCTATGGACT 
TGGAGCAAAACAACATGATGATGATGAGGATGATCCCTTTTGATTACCCTCCTACAGAGA 
CTTTCAAACCTATGAACI^CGTAATGCCAGATC 

ATTGTTATAGAGCAACGAGTTTCAACAAGACCAAACCATTTCTTACACGAAAGTTGTCTT
CTTCTTCTTCATCATCATCATGGAAAGAAACCAAAAAGT 

GGACTGCTGAAGAAGACAGGGTACTGATTCAACTCGTGGAGAAGTATGGATTGCGTAAAT 

GGTCGCATATCGCTCAAGTGTTACCGGGAAGAATCGGGAAACAATGTAGAGAGAGGTGGC 

ATAACCATTTGAGAGCTGACATTAAGAAAGAAACATGGAGTGAAGAAGAGGACAGAGTGT 

TGATAGAATTTCACAAAGAGATTGGAAACAAATGGGCAGAGATTGCGAAAAGACTCCCGG 

GAAGAACAGAGAACTCGATCAAGAACCATTGGAACGCAACAAAAAGAAGACAATTOT 

AAAGAAAATGTAGATCTAAGTATCCAAGACCTTCTCTGTTGCAGGATTAC^TCAAGAGCT 

TGAATATGGGAGCTTTGATGGCTTCTTCTGTTCCTGCAAGAGGTAGACGCAGAGAGAGTA 

ATAACAAGAAGAAGGATGTTGTTGTTGCGGTTGAGGAGAAGAAGAAGGAAGAGGAGGTGT 

ATGGACAAGACAGGATTGTGCCTGAATGTGTGTTTACTGATGATTTTGGATTCAATGAGA 

AGCTGCTTGAGGAAGGATGTAGCATTGACTCTTTGCTTGATGACATTCCTGAGCCTGACA 

TTGATGCTTTTGTTGATGGGCTCTGATTTGTATTTTTTATTCTGCTTGTTTCAGTTTTGT 

TGTTTTTTGTTTGTCTTTTTATACGAGACAGAT^ 

ATATAAAATATTTTGCTTTTTAAAAAAAAAAAAAAAAAAAAAAA 

>G1790 Amino Acid Sequence (conserved domain in AA coordinates: 217-316)
MENFVDENGFAS LNQNIFTRDQEHMKEEDFPFE WDQS KPTS FLQDFHHLDHDHQFDHHH 
HHGS S S SHPLLS VQTTS S C I3OTAPFEHCSY QENMVlDFYETKPNIiMNHHHFQAVENS 
NHHHHQEIl^VDEHDDPMDIiEQNIsnwiMMMRMIPFDYPPT^ 

CYRATSFNKTKPFLTRKLSSSSSSSSWKETKKSTLVKGQWTAEEDRVLIQLVEKYGLRKW

SHIAQVLPGRIGKQCRERWHNHLRPDIKKETWSEEEDRVLIEFHKEIGNKWAEIAKRLPG

RTENSIKNHWNATKRRQFSKRKCRSKYPRPSLLQDYIKSLNMGALMASSVPARGRRRESN

NKKKDVWAVEEKKKEEEWGQDRIVPECT IDSLLDDI PQPD I 

DAFVHGL* 

>G1791 (36..455)

ATGTACATGCAAAAACAAAAACCTTAAAAGCTTTCATGGAACGTATAGAGTCTTATAACA 
CGAATGAGATGAAATACAGAGGCGTACGAAAGCGTCCATGGGGAAAATATGCGGCGGAGA 
TTCGCGACTCAGCTAGACACGGTGCTCGTGTTTGGCTTGGGACGTTTAACACAGCGGAAG 
ACGCGGCTCGGGCTTATGATAGAGCAGCTTTCGGCATGAGAGGCCAAAGGGCCATTCTCA 
ATTTTCCTCACGAGTATCAAATGATGAAGGACGGTCCAAATGGCAGCCACGAGAATGCAG 
TGGCTTCCTCGTCGTCGGGATATAGAGGAGGAGGTGGTGGTGATGATGGGAGGGAAGTTA 
TTGAGTTCGAGTATTTGGATGATAGTTTATTGGAGGAGCTTTTAGATTATGGTGAGAGAT 
CTAACCAAGACAATTGTAACGACGCAAACCGCTAGATCATCACTACTTACTTACAGTGTA 
ATGTTTTTGGAGTAAAGAGTAATAATCAATATAATATACTTTAGTTTAGGAAAAAAAAAA 
AAAAAAAAA 

>G1791 Amino Acid Sequence (domain in AA coordinates: TBD)
MERIESYNTNEMKYRGVRKRPWGKYAAEIRDSARHGARVWLGTFNTAEDAARAYDRAAFG
MRGQRAILNFPHEYQMMKDGPNGSHENAVASSSSGYRGGGGGDDGREVIEFEYLDDSLLE
ELLDYGERSNQDNCNDANR*
>G1793 (59.. 1783) 

AGTGATTTATTGATTAACCCAAACACAAAATAAACAGATTTGACTC^AA^ 
TGAATACAACCTTGGCTTGGTCAGCGACCATATGGAC 

GAATATGATCAATCCACACGGTGGAGGAGGAGATGAAGGAGGAGAGGTTCCAAAAGTGGC 
CGATTTTCTCGGTGTGAGCAAACCGGACGAAAACCAATCCAACCACCTAGTAGCTTACAA 
CGACTCAGACTACTACTTCCATACCAATAGCTTGATGCCTAGCGTCCAATCAAACGATGT 
CGTTGTAGCZAGCTTGTGACTCCAATACTCCTAACAACAGTAGCTATCATGAGCTTCAAGA 
GAGTGCTCAC^TCTAC^GTCACTTACTTTGTCCATGGGGACCACCGCTGGTAATAATGT

TGTAGACATU^GCTTCACCATCCGAGACCACCGGGGATAACGCTAGCGGTGGAGCACTAGC 

CGTTGTTGAGACGGCC^CGCCAAGACGTGCATTGGACACTTTCGGACAACGAACCTC^ 

CTATCGTGGTGTCACAAGACATCGATGGACTGGTCGATATGAGGCTCATCTATGGGATAA 

TAGTTGTAGAAGGGAAGGCCAGTCTAGGAAAGGAAGACAAGTTTACTTGGGTGGATATGA 

CAAAGAAGATAAAGCAGCAAGATCATATGATCTAGCTGCACTTAAGTACTGGGGTCCTTC 

AACTACTACTAATTTCCCCATTACAAACTACGAGAAAGAAGTAGAGGAAATGAAGCACAT 

GACGAGACAAGAGTTCGTGGCTGCCATTAGAAGGAAAAGTAGTGGATTTTCGAGAGGCGC 

TTCGATGTATCGAGGAGTTACAAGGCATCACCAACATGGAAGATGGCAAGCAAGGATCGG

CCGAGTCGCCGGAAAGAAAGACCTCTACTTGGGAACTTTTAGCACTGAGGAAGAAGCAGC 

AGAAGCTTACGATATAGCTGCAATAAAGTTTAGAGGACTTAATGCAGTGACGAACTTCGA 

GATCAACCGGTACGACGTGAAAGCCATTCTAGAGAGTAGCACTCTTCCCATCGGAGGAGG 

CGCAGCTAAACGGCTCAAAGAAGCTCAAGCTCTTGAGTCTTCAAGGAAACGCGAGGCGGA 

GATGATAGCCCTTGGTTCAAGT1TCCAGTACGGTGGTC 

CACCTCATCAAGACTTCAGCTTCAACCTTACCCT 

TTTTCTATCTCTTCAGAACAATGACAT 

CTCCTCTTTTAATCACCATAG CTATATCCAGACACAACTTCATCTCCAC CAACAGAC C AA 
GAATTACTTGCAGCAACAGTCGAGCCAGAACTCTC^^ 

TAGCAATCCGGCTCTGCTTCATGGACTTGTCTCTACCTCTATCGTTGACAACAATAATAA 
CAATGGAGGCTCTAGTGGGAGCTACAACACTGCAGCATTTCTTGGGAACCACGGTATTGG 
TATTGGGTCCAGCTCGACTGTTGGATCGACCGAGGAGTTTCCAACCGTTAAAACAGATTA 
CGATATGCCTTCCAGTGATGGAACCGGAGGGTATAGTGGTTGGACCAGTGAGTCTGTTCA 
GGGGTCAAACCCTGGTGGTGTTTTCACTATGTGGAATGAGTAAACAAGGATCTCTTTCTT 

GCGGCACAAGGAATGGGT 

>G1793 Amino Acid Sequence (conserved domain in AA coordinates : 179-255 , 281-349) 

MNSNNWLGFPLS PNNS SIjPPHEYNIjGI»VSDHMDNPFQTQEWNM INPHGGGGDEGGEVPKV 

ADFLGVSKPDENQSNHLVAYNDSDYYFHTNSIiMPSVQSl^^ 

ESAHI^QSLTLSMGTTAGNlttATDra 

IYRGV^RHRWTGRYEAHLWDNSCRREGQSRKGRQV^ 

S TTTNPPITNYEKEVEEMKHMTRQEFVAAIRRKS S GFSRGASMYRG VTRHHQHGRWQARI 
GRVAGNKDLYLGTFSTEEEAAEAYDI AAI KFRGLNAVTNFE INRYDVKAIIiES STLP I GG 
GAAKRLKEAQALESSRIOE^EAEMIAIjGSSFQyGGGSSTGSGSTSSRLQLQPYPLSIQQPIiE 

PFLSLQNITOISHYNNNNAHDSSSFNHHSYIOTQ 

HSNPALLHGLVSTS I VXJNNNNNGGSSGSYlirrAAFLGNHGIGIGSSSW 

YDMPSSDGTGGYSGWTSESVQGSNPGGVFTMWNE* 

>G1795 (27.. 422) 

ACAAACACGCAAAAAGTCATTAATATATGGATCAAGGAGGTCGAGGTGTCGGTGCCGAGC 
ATGGAAAGTACCGGGGAGTTCGGAGACGACCTTGGGGAAAATATGCAGCAGAGATACGAG 
ATTCGAGGAAGCACGGTGAACGTGTGTGGCTTGGAACGTTCGATACGGCAGAGGAAGCGG 
CTAGAGCCTATGACCAAGCTGCTTACTCCATGAGAGGCCAAGCAGCAATCCTTAACTTCC 
CTCATGAGTATAACATGGGGAGTGGTGTCTCTTCTTCCACCGCCATGGCTGGATCTTCCT 
CCGCCTCCGCCTCCGCTTCTTCTTCTTCTAGGCAAGTTTTTGAATTTGAGTACTTGGATG 
ATAGTGTTTTGGAGGAGCTCCTTGAGGAAGGAGAGAAACCTAACAAGGGCAAGAAGAAAT 
GAGCGAGATATAATTCATGATTATTTCTAA 

>G1795 Amino Acid Sequence (domain in AA coordinates: 12-80) 
MDQGGRGVGAEHGKYRGVRRRPWGKYAAEIRDSRKHGERVWLGTFDTAEEAARAYDQAAY
SMRGQAAILNFPHEYNMGSGVSSSTAMAGSSSASASASSSSRQVFEFEYLDDSVLEELLE
EGEKPNKGKKK*

>G1800 (61..894)

CCATTATCATATCCTeTTCTTCCTTCTTCACTATCAATCTTCTTCTCCACTAG?^ 

ATGGAGAAATCATCCTCAATGAAACAATGGAAGAAGGGTCCTGCTCGGGGTAAAGGCGGT 

CCAGAAAACGCTCTTTGTCAGTACCGTGGAGTCAGGCAAAGGACTTGGGGCAAATGGGT^ 

GCTGAGATCAGAGAGCCCAAGAAGAGGGCAAGACTTTGGCTTGGCTCTTTCGCTACAGCT 

GAAGAAGCAGCTATGGCTTATGATGAGGCTGCCTTGAAACTCTATGGGCACGACGCATAC 

CTCAACTTACCTCATCTTCAGCGGAATACAAGACCTTCTCTGAGTAACTCTCAGAGGTTC 

AAATGGGTACCTTCAAGGAAGTTTATATCTATGTTTCCTTCATGTGGTATGCTAAACGTG 

AATGCTCAGCCTAGTGTTCACATAATCCAGCAAAGACTAGAAGAACTCAAGAAAACTGGA 
CTTTTATCTCAATCCTATTCTTCTAGTTCTTCCTCCACCGAATCAAAAACTAATACTAGC
TTTCTTGATGAGAAGACCAGCAAGGGAGAAACAGACAATATGTTCGAAGGTGGTGATCAG 
AAGAAACCAGAGATCGACCTGACCGAGTTTCTTCAGCAACTAGGAATCTTGAAGGATGAA 
AATGAAGCAGAACCAAGTGAGGTAGCAGAGTGTCATTCCCCTCCACCATGGAACGAGCAA 
GAAGAAACTGGAAGTCCTTTC^GAACTGAGAATTTCAGCTGGGATACCCTGATCGAGATG 
CCAAGAAGTGAAACCACAACTATGCAATTTGACTC 

GAGGATGATGTATCCTTCCCTTCCATCTGGGACTACTACGGAAGCTTAGATTGAGTAAAA 
GCAATTTAAGGTAGATCAAGATTCAGAAGTACACAA^ 

TTTGGAAAAGAGACATAGGTAGTGAGAGTGCAGTCTTTTATTATGCAGCAATAAAGTGAG 

TCAGTGTACAACCGAGTTGTTCGCTTTTTTT^ 

CGCTAAAAAAAAAAAAAAAAAAAAAAA 

>G1800 Amino Acid Sequence (domain in AA coordinates: TBD) 
MEKS S S MKQ WK3CGPARGKGGPQNALCQ YRGVRQRTWGKWVAE I REPKKRARLWLGS FATA 
EEAAMAYDEAALKLYGHDAYIiN^ 

NAQPSVHIIQQRIjEEIiKKTGLLSQSYSSSSSSTESKT^ 
KKPEIDLTEFIiQQLGILKI>ENEAEPSEVAEra^ 
PRSETTTMQFDSSNFGSYDFEDDVSFPS IWDYYGSLD* 
>G1806 (1..1356) 

ATGCAGAGCAGCTTCAAAACCGTTCCTTTCACTCCTGATTTCTACTCTCAATCCTCTTAC 
TTCTTGAGAGGAGATAGTTGTCTTGAGGAGTT 

GAAGAAGCTATCGATTTAAGTCCAAATGTCACTATTGCTTCAGCTAACTTACACTACACG 
ACGTTTGATACGGTTATGGATTGTGGTGGTGGTGGTGGTGGCTTGAGGGAGAGACTTGAA 
GGAGGAGAAGAGGAGTGTTTGGAC^CAGGGCAATTAGTGTACCAGAAAGGGACAAGATTA 
GTAGGAGGAGGAGTAGGAGAAGTGAACAGCAGTTGGTGTGATTCGGTTTCAGCTATGGCT 
GATAACA.GTCAACATACTGAGACTTCCACAGATATTGATACTGATG 

AATGGAGGTCATCAAGGGATGCTATTGGCTACAAATTGTTCAGATCAATCCAATGTGAAA 
TCTAGTGATCAAAGGACACTTCGTCGACTTGCTCAGAACCGGGAGGCTGCTAGGAAAAGT 
CGGTTGAGGAAAAAGGCCTATGTTCAGCAACTTGAGAATAGTC 

CTAGAGGAAGAGCTCAAAAGAGCTCGCCAACAGGGATCTTTGGTTGAAAGAGGAGTTTCA 
GCGGATCACACGC^TTTGGCAGCAGGAAATGGTGTCTTTTCATT^ 

CGTTGGAAGGAGGAAC^TCAAAGAATGATCAACGACTTAAGATCGGGTGTGAATTCGCAG 
TTAGGTGACAACGATCTACGCGTTCTAGTGGATGCTGTGATGAGTCACTATGATGAAATA 
TTCAGGCTAAAGGGAATTGGCACTAAAGTTGAAGTCTTTCATATGCTCTCAGGCATGTGG 
AAGACACCTGCCGAGAGATTTTTCATGTGGTTAGGTGGATTTAGATCATCAGAGTTACTT 
AAGATATTGGGGAACCATGTGGATCCATTGACGGACCAGCAGTTGATAGGCATTTGCAAC 
CTTCAGCAATCGTCTCAACAAGCAGAGGATG 

CAATCACTTCTCGAGACGCTTTCTTCTGCTTCTATGGGTCCAAACTCTTCAGCAAATGTT 
GCAGATTATATGGGTCATATGGCTATGGCTATGGGCAAACTTGGCACTCTTGAAAACTTC 
CTTCGCCAGGCTGATTTATTGAGGCAACAAACTCTGCAACAGCTTCACAGAATTCTCACC 
ACACGACAAGCTGCTCGCGCCTTTTTGGTCATCCACGATTATATTTCTCGGCTTAGAGCA 
CTTAGCTCTCTATGGTTAGCCAGACCTAGAGACTAA 

>G1806 Amino Acid Sequence (domain in AA coordinates 165-225) 
MQSSFKWPFTPDFYSQSSYFFRGDSCLEEFHQPWGFHHEEAIDLSPNVTIASANLHYT 
TFDTVMDCGGGGGGLRERLEGGEEECLDTGQLVYQKGTRIjVGGGVGEVNSSWCDSVSAMA 
DNSQHTDTSTDIDTDDKTQLNGGHQGMLIA^ 

RLRKmWQQLENSRXRLAQIiEEELKRARQQGSLVERGVSADHTHLAAGNGWSFEL 

RWKEEHQRM INDIiRSGVNS QLGDNDLRVIjVDAVMSHYDE I FRLKGI GTKVEVFHMLSGMW 

KTPAERFFMWLGGF^SSEIiLKILGlIHVDPLTDQQLIGIC^njQQSSQQAEDAIjSQGMEALQ 

QSLLETLSSASMGPNSSANVADYMGHMAMAM^ 

TRQAARAFLVIHDYISRLRALSSLWLARPRD* 

>G1811 (93.. 827) 

AAAGGAGC^TTGGTATCTCAAACAATATTTGCCCTTTCTCTATCTCT 
TTGCCATCTCTTTCTCTCTCCCTCTCTTTCAAATGTCAATAAACCAATACTC^^GCGATT 

TCCACTACCATTCTCTCATGTGGCAACAACAGCAGCAACAACA^ 

TCGTGGAAGAAAAAGAAGCTCTTTTCGAGAAACCCTTAACCCCAAGTGACGTCGGAAAAC 
TCAACCGCCTCGTCATCCCAAAACAGCACGCCGAGAGATACTTCCCACTAGCGGCCGCCG 
CCGCAGACGCCGTGGAGAAAGGACTTCTCCTCTGCTTTGAGGACGAGGAAGGTAAACCAT 
GGAGATTCAGATACTCGTACTGGAACAGTAGCCAGAGTTATGTCTTGACCAAAGGCTGGA
GCAGATACGTCAAGGAGAAGCACCTTGACGCCGGAGACGTCGTTCTCTTCCATCGACACC 
GTTCAGACGGCGGAAGATTCTTCATTGGCTGGAGAAGACGCGGTGACTCTTCTTCCTCCT 
CCGACTCTTATCGCCATGTTCAATCCAATGCCrrCGCTCCAATATTATCCTCATGCAGGGG 
CTCAAGCGGTGGAGAGCCAAAGAGGCAACTCGAAGACATTAAGACTGTTCGGAGTGAACA 
TGGAGTGCCAGCTAGATTCGGACTGGTCCGAGCCATCCACACCTGACGGTTCTAACACAT 
ATACAACCAATCACGACCAGTTTCATTTCTACCCTCAACAACAA 

ACTACATGGACATAAGTTTCACAGGAGATATGAACCGGACGAGCTAGAAGCCCACAAGGA 
TTAAAAAAAAGCTTCACATCTGGTCCTGTTATGTTGTCATAGATGTTGATTCCTTAATTT 
TACACAAGCTTCATTTTGGATTATTTAAAGT^ 
TCTCTCAATTTTCACTCTCTTCCTTTTTCTTCTTATGT^ 

ACACTTGTATAGAGAATTCAAAGTTCTGGCTATTTTCGAAAGTTATCTTTTCTCTTAAAA 
AAAAAAA 

>G1811 Amino Acid Sequence (domain in AA coordinates: TBD) 
MSINQYSSDFHYHSLMWQQQQQQQQHQXTOVTOEKEALFEK^^ 

ER YF PLAAAAAD AVE KGLLLC FEDEEGKP WRFRYS YWNS S Q S YVTjTKGW S R YVKE KHIjD A 
GDVVLFHRHRSDGGRFFIGWRRRGDSSSSSDSYRHVQSNASLQYYPHAGAQAVESQRGNS 
KTLRLFGVNMECQLDSDWSEPSTPDGSNTYTTNHDQFHFYPQQQHYPPPYY^ 

NRTS* 

>G182 (74.. 1366) 

CGTCGACGATCAGATTCTTGCGTATAGCTGTATATATACACCAAGATACACTCATCATCG 

TCATATATAGATTATGTGCAGCGTCTCTGAGCTTCTTGACATGGAAAACTTCCAAGGAGA 

CTTAACCGACGTCGTACGAGGAATCGGAGGCCACGTGTTATCACCGGAGACTCCTCCCTC 

GAAC^TCTGGCCTCTTCCTCTGTCACATCCAACACCATCACCGTCAGATCTTAACATAAA 

CCCCTTCGGAGATCCGTTTGTGAGCATGGACGATCCACTCCTCCAAGAACTAAACTCCAT 

CACAAACTCCGGCTATTTCTCCACCGTAGGAGATAACAACAACAACATTCACAACA 

TGGTTTCTTGGTTCCAAAGGTATTTGAGGAGGATCATATAAAGAGTCAATGTAGT 

CCCAAGAATCCGGATCTCGCATAGTAACATCATCCACGATTCTTCTCCGTGTAATTCTCC 

GGCCATGTCGGCTCACGTTGTCGCAGCCGCAGCAGCCGCCTCGCCGAGAGGCATCATCAA 

CGTAGACACAAACAGTCCTAGAAACTGTCTATTGGTTGATGGTACCACGTTCTCCTCGCA 

GATTCA.GATATCTTCCCCTCGGAATCTAGGCCTTAAAAGAAGGAAGAGTCAGGCAAAGAA 

GGTGGTGTGTATTCCGGCCCCGGCTGCAATGAACAGCCGATCAAGCGGAGAAGTGGTTCC 

ATCGGATCTATGGGCTTGGCGTAAATACGGTCAAAAACCTATCAAAGGCTCTCCTTTTCC 

AAGGGGTTATTATAGATGCAGCAGCTCAAAAGGTTGTTCAGCAAGAAAGCAAGTCGAAAG 

AAGCCGAACCGATCCAAACATGTTGGTGATTACATATACCTCCGAACATAACCATCCTTG 

GCCCATCC^ACGC^U^CGCTCTCG^^ 

CCCTAATCCTTCCAAACCCTCAACCGCAAACGTAAACTCCTCATCCATTGGCTCCCAAAA 
CACC^TCTACTTGCCTTCCTCCACCACTCCTCCTCCTACCCTCTCATCCTCCGCCATCAA 
AGATGAACGAGGGGACGATATGGAGTTGGAAAACGTAGATGATGATGATGATAACCAGAT 
TGCTCCATACAGACCGGAGCTTCATGATCATCAGCACC7UVCC7VGATGATTTCTTTGCAGA 
TCTTGAAGAGCTAGAAGGAGATTCTCTAAGCATGTTGCTTTCTCATGGCTGTGGCGGCGA 
CGGGT^GGATAAAACGACCGCGTCCGATGGGATCAGCAATTTCTTCGGGTGGTCGGGAGA 
TAATAATTATAATAATTACGACGACCAAGACTCAAGGTCGXTATAGTATAGTGTTAATTA 
CAGGTAAACAAATTATATTAAATTAAGTTGAGCTTGTGAAAATGAAGATCATATGGTCTG 
GTCAGGTTGGGGGC 

>G182 Amino Acid Sequence (conserved domain in AA coordinates : 217-276) 
MCSVSELLDMENFQGDLTDVWGIGGHVTiSPETPPSNIWPLPIiSHPTPSPSDLNINPFGD 
PFVSl^DPLLQELiraiTNSGYFSTVGDI^^ 
ISHSNIIHDSSPOSTSPAMSAHWAAAAAASPRGIINTO^ 

S PRNLGLKRRKS QAKKWCI PAPAAMNS RS SGEWPSDLWAWRKYGQKP IKGS PFPRGYY 
RCSSSKGCSARKQV^RSRTDPOTOYreTYT^ 

KPSTANVNS S S IGSQNTI YLPS STTPPPTLS S SAI KDERGDDMEIjENVDDDDDNQIAPYR 
PELHDHQHQPDDFFADLEELEGDSLS^LSHGCGGDGKDKTTASDGISNFFGWSGDNNYN 

NYDDQDSRSL* 
>G1835 (1..969) 

ATGATTGGAACAAGCTTCCCCGAGGATCTTGATTGTGGCAACTTCTTTGACAACATGGAT 
GATCTCATGGACTTTCCCGGTGGAGATATCGATGTCGGTTTCGGCATAGGTGACTCCGAC 
TCTTTCCCTACCATCTGGACCACTCATCACGACACGTGGCCTGCCGCTTCTGATCCTCTC
TTCTCTTCCAACACCAACTCTGATTCATCACCTGAGCTCTATGTTCCGTTTGAGGACATT 
GTTAAGGTGGAAAGACCTCCAAGCTTTGTAGAGGAAACATTGGTTGAGAAGAAGGAAGAT 
TCGTTTTCGACAAACACTGATTCATCATCTTCTCATAGCCAATTCAGGAGCTCAAGTCCA 
GTGTCGGTTCTCGAGAGCAGCTCCTCCTCGTCTCAAACC^CCAACACAACCTCCCTTGTT 
CTCCCTGGAAAGCACGGTCGTCCACGCACAAAACGCCCTCGTCCACCTGTCCAGGATAAA 
GATAGAGTCAAAGACAATGTGTGCGGTGGTGACTCGCGCCTCATCATTAGAATACCGAAA 
C^GTTTCTCTCTGATCACAACAAGATGATCAACAAGAAGAAGAAGAAGAAGGCCAAGATT 
ACTTCTTCCTCTTCTTCGTCCGGGATTGATCTTGAAGTCAATGGAAACAACGTCGATTCG 
TATTCTTCAGAGCAATATCCGCTTAGGAAATGTATGCACTGTGAGGTCACCAAGACTCCA 
CAGTGGAGGCTTGGTCCAATGGGTCCAAAGACACTTTGCAATGCGTGCGGTGTACGTTAC 
AAATCAGGGAGGCTTTTCCCGGAGTACCGTCC^GCTC 

CTTCACTCAAACTC^CACAAGAAAGTGGCTGAAATGAGAAACAAGAGATGCAGTGATGGT 
AGCTACATAACCGAAGAGAATGATCTGC^GGGCTGATTCCGAACAATGCCTACATTGGC 

GTAGACTAA 

>G1835 Amino Acid Sequence (domain in AA coordinates: 224-296) 
MIGTSFPEDLDCGOTFDNMDDL^ 

FSSNT^SDSSPELYVPFBDrVKVERPPSFVEETIjVEKKEDSFSTNTDSSSSHSQFRSSSP 
VSVLES SS S S SQTTNTTSLVLPGKHGRPRTKRPRPPVQDKDRVKDNVCGGDSRL I IRI PK 
QFLSDHNKMINKKKKKKAKITSSSSSSGIDLEVN^^ 

QWRLGPMGPKTLCNACGVRYKSGRLFPEYRPAASPTFTPAIjHSNSHKKVAEMRN 
S YI TEENDLQGLIPNNAYIGVD * 
>G1836 (47.. 610) 

ATAAC^UVGCCTAGAACACTAGAAACTTCAAAAAAGAAAAAAATCTTATGGAGAACAAC^ 
CGGCAACAACCAGCTGCCACCGAAAGGTAACGAGCAACTGAAGAG 

GATGGAAGGTAACTTAGATTTCAAAAATCACGACCTTCCTATAACTCGTATCAAGAAGAT 
TATGAAGTATGATCCGGATGTGACTATGATAGCTAGTGAGGCTCCAATCCTCCTCTCGAA 
AGCATGTGAGATGTTTATCATGGATCTCACGATGCGTTCGTGGCTCCATGCTCAGGAAAG 
CAAACGAGTCACGCTACAGAAATCTAATGTCGA^ 

TGATTTCTTGCTTGATGATGACATTGAGGTAAAGAGAGAGTCTGTTGCCGCCGCTGCTGA 

TCCTGTGGCCATGCCACCTATTGACGATGGAGAGCTGCCTCCAGGAATGGTAATTGGAAC 

TCCTGTTTGTTGTAGTCTTGGAATCCACCAACCACAACCACAAATG^^ 

AGCTTGGACCTCGGTGTCTGGTGAGGAGGAAGAAGCGCGTGGGAAAAAAGGAGGTGACGA 

CGGAAACTAATAAGTGGAATACGTTTTAGGGTATTTTCAAGGGAATATGTAGTAAATAGT 

CATGGATC 

>G1836 Amino Acid Sequence (domain in AA coordinates: 30-164) 
MENNNGl^QLPPKGNEQLKSFWSKErcEGN^^ 

ILL S KACEMF I MDLTMRS WLHAQE S KRVTLQKSNVD AAVAQTV I FDFLLDDD I E VKRE S V 
AAAADPVAMPPIDDGELPPGMVTGTPVCCSLGIHQPQPQMQAWPGAWTSVSGEEEEARGK 

KGGDDGN* 

>G1838 (132.. 1628) 
TTCCTTGGCATTCTCTTTAGAACTTTCGT^ 

AAAAAAAAGATTAGAGACTGTAACTGCTTTTATCAGATTTTCAACTAGGAAAAAAGTTAC 

AGATGTTGAAATCAACTGATC^GTCTCACTTCTCTTCTTCTTACGACGATTCTTCTACTC 
CTTATCTCATCGATAACTTCTATGCTTTCAAAGAAGAAGCTGAGATAGAAGCTGCTGCTG 
CTTCAATGGCGGATTCAACAACCTTATCTACTTTTTTCGATCATTCTCAGACTCAGATTC 
CAAAGCTGGAAGATOTCCTCGGTGATTCCTTTGTCCGTTACTCTGATAACCAAACAGAGA 
CCCAAGACTCTTCTTCTCTCACTCCATTCTACGATCCACGTCACCGCACCGTTGCCGAAG 
GAGTTACAGGGTTCTTCTCTGATCATCATCAGCCAGATTrCAAGACGATAAACTCGGGAC 
CAGAAATCTTCGATGACTCAACAACTTCCAAGAT 

TGGTGGAGTCATCAACGACGGCGAAGTTAGGGTTTAACGGTGATTGCACCACCACCGGAG 
GAGTTTTGTCTCTAGGGGTTAACAACACATCAGATCAACCTTTGAGCTGTAACAATGGCG 
AGAGAGGTGGAAACAGTAACAAGAAGAAAACAGTTTCTAAGAAGGAAACATCAGATGATT 
CAAAGAAGAAGATTGTCGAAACATTGGGACAAAGAACTTCAATTTATCGTGGAGTCACCC 
GACATAGATGGACTGGAAGATACGAAGCGCATCTATGGGATAACAGCTGTAGGAGGGAAG 
GTCAAGCCAGAAAAGGACGTCAAGTGTACTTAGGTGGATATGACAAGGAAGATAGAGCAG 
CTAGAGCCTATGACTTGGCAGCTTTAAAATACTGGGGTTCTACTGCTACTACAAATTTTC
CGGTCTCGAGTTATTCAAAAGAACTTGAGGAAATGAATCACATGACCAAGCAAGAGTTTA 
TTGCATCTCTTAGGAGGAAAAGTAGCGGTTTTTCGAGAGGAGCTTCAATATATAGAGGTG 
TCACAAGGCATCATCAACAAGGTCGCTGGCAAGCAAGAATCGGCCGTGTCGCAGGAAAC^ 
AAGATCTTTACCTCGGAACCTTTGCAACCGAAGAGGAAGCAGCAGAGGCTTATGACATTG 
CAGCCATAAAGTTCAGAGGAATCAACGCAGTAACTAACTTTGAGATGAACAGGTATGACA 
TTGAAGCTGTCATGAATAGTTCTTTACCTGTAGGAGGAGCAGCTGCGAAACGCCACAAAC 
TCAAACTCGCTCTTGAATCTCCTTCTTCATCA.TCCTCTGACCATAACCTCCAACAACAAC 
AGTTGCTTCCGTCCTCTTCTCCCTCGGATCAAT^ACCCTAACTCAATCCCATGTGGCATTC 
CATTTGAGCCTTCAGTTCTCTATTACCACCAGAACTTCTTTCAGCATTATCCTTTGGTCT 
CTGACTCTACAATTCAAGCTCCTATGAACCAAGCTGAGTTTTTCTTGTGGCCTAACCAGT 
CTTACTAAATCATTTGGTTCGTTCTTGCTTAGACTTCTATTCACCGCACTAACCGATGAC 
CCGAGGCTTATCTTCTTGATTCTGGCTATAAGGATGAATCTTTCAAGTTCCTTTTTTAAC 
TGTAGGCTAAGACAGAAGTAGAGGGGAGAT^AAGTTGAAGAATCTGAAACTTTTGGGGTCA 
ATTTTGTATTAATGTTTTTCTTTTGTCAAGGGTGGATTATCGGTTTTATTACTTATTTTT 
TGAATGTAATCGGCCTATAACGGTATAACTCTGTTTCCATTTATGAATATTTTTCTCAAA 
TTGAAAAAAAAAAAAAAAAAA 

>G1838 Amino Acid Sequence (conserved domain in AA coordinates : 229-305 , 330-400) 

MAPPMTNCLTFSLSPMEMLKSTDQSHFSSSYDDSSTPYLIDNFYAFKEEAEIEAAAASI^ 

DSTTIiSTFFDHSQTQIPKLEDFIiGDSFVRYSDNQTETQDSSSLTPFYDPRHRTVAEGVTG 

FFSDHHQPDFKTINSGPEIFDDSTTSNIGGTHLSSHVVESSTTAKLGFNGDCTTTGGVIjS 

LGVNNTSDQPLSCNNGERGGNSNKKK^ 

TGRYEAHLWDNS CIU^GQARKGRQVYLGGYDKEDRAAI^YDIjAALKyWGSTATTNFPVS S 
YS KELEEMNHMTKQEFI ASLRRKS SGFSRGAS X YRGVTRHHQQGRWQARI GRVAGNKDIiY 
LGTFATEEEAAEAYD I AAI KFRG INAVTNFEMNRYDIEAVMNS SIiP VGGAAAKRHKLKLA 
LESPSSSSSDHNLQQQQLLPSSSPSDQNPNSIPCGIPFEPSVl^YYHQNFFQHYPLVSDST 
IQAPMNQAEFFLWPNQSY* 
>G1843 (51..653)

CAGACATCACAATCAAATTAGGTCAGAAGAATTAGTCGGAGAAAACAGCCATGGGAAGAA 
GAAAAGTAGAGATCAAACGAATTGAGAACAAAAGCTCTCGACAAGTTACTTTCTGTT^AAC 
GACGAAATGGTCTCATGGAGAAAGCTCGTCAACTCTCAATTCTTTGTGAATCCTCCGTCG 
CTCTTATCATC^TCTCTGCCACCGGAAGACTCTACAGCTTCTCCTCAGGTGATAGCATGG 
CCAAGATCCTCAGTCGTTATGAATTAGAACAGGCTGATGATCTTAAAACCTTGGATCTAG 
AAGAAAAAACTCTTAATTATCTTTCGGACAAGGAGT^ 

TTGAAGAAGCGAAAAGCGATAATGTAAGTATAGATTGTCTAAAGTCCCTGGAAGAGCAGC 
TCAAGACTGCTCTGTCTGTAACTAGAGCTAGGAAGACAGAACTAATGATGGAGCTTGTGA 
AGACCCATCAAGAGAAGGAGAAGCTGCTGAGAGAGGAGAACCAGAGTTTGACTAACCAGC 
TTATAAAGATGGGGAAGATGAAGAAGTCTGTGGAAGCAGAGGATGCAAGAGCAATGTCAC 
CGGAAAGTAGCTCTGACAACAAGCCACCGGAGACTCTCCTGCTTCTCAAGTAACCACCAT 
CACCAACGACTGATTCGAAAAATAAAAATTGTAAAAATTATGATTTGTAGTTCATAAGGA 
AAGCTACATACTGTATGTTAAAAATCCTCTTCTTCCCCCTGCTACGGAAAAGTCATCCAA 
GGAGATGCATCAAATAAAGTAATTGATTTTTATTGTTA 

>G1843 Amino Acid Sequence (domain in AA coordinates: 2-57) 

MGRRKVEIKRIENKSSRQOTFCKRRNGLMEKAR 

DSMAKZLSRYELEQADDLKTLDLEEKTLNra 

EEQLICTALS VTRARKTELMMELVlCrHQEKEKLIjREENQSLTNQL I KMGKMKKS VEAEDAR 

AMSPESSSDNKPPETLLLLK* 

>G1853 (1..1860)

ATGAGAGGTTCTTGGTAC^^GAGTGTTTCCTCTGTTTTTGGTCTCAGACCACGGATCAGA 
GGGTTGTTATTCTTCATTGTTGGTGTTGTGGCTCTAGTTACTATTTTAGCACCATTGACA 
TCTAATTCGTATGATTCTTCGTCAAGTTCGACACTTGTGCCGAACATTTATAGTAACTAT 
AGGAGGATAAAGGAGCAAGCTGCTGTTGATTATCTTGATCTGAGGTCTCTTTCTTTAGGG 
GCTAGTTTAAAAGAGTTTCCTTTTTGTGGTAAAGAAAGAGAAAGTTATGTGCCTTGTTAT 
AACATAACTGGGAATTTGCTTGCTGGGCTTCAAGAGGGTGAGGAGTTAGATCGACATTGC 
GAGTTTGAAAGAGAGAAGGAAAGATGTGTAGTTCGTCCTCCGAGAGATTATAAAATACCA 
CTTAGGTGGCCACTTGGTAGAGATATCATATGGAGTGGGAACGTGAAGATTACCAAAGAC 
CAGTTTCTTTCTTCAGGAACTGTGACAACGAGGTTAATGTTGCTTGAAGAGAATCAAATA 
ACCTTTCACTCGGAGGACGGCCTGGTCTTTGATGGGGTCA

GCTGAGATGATAGGTTTAGGAAGTGATACTGAATTTGCTCAAGCGGGTGTACGGACTGTG 

TTAGACATTGGTTGCGGATTTGGTAGCTrTTGGTGOTCATTTAGTGTCTTTGAAGCTGATC 

CCTATATGTATTGCTGAGTATGAGGCAACTGGGAGCCAAGTTCAGTTAGCTCTAGAGAGA 

GGCCTTCCTGCAATGATTGGCAATTTCTTT^ 

TTTGATATGGTCGATTGTGCTCAATGTGGCACTACTC 

CTTTTGGAAGTGGATCGTGTTCTGAAACCCGGGGGATACTTTGTTTTAACTTCTCCCACA 

AACAAAGCACAGGGAAACTTACCAGATACCAAGAAAACGAGCATCTCAACACGGGTGAAT 

GAGTTATCTAAGAAAATCTGTTGGAGTCTAACAGCTC^GCAGGATGAGACGTTTCTTTG 

CAGAAAACTTCTGATTCAAGTTGCTATTCTTCTCGTTCGCAAGCTTCTATACCTCTTTGC 

AAAGATGGAGATAGCGTTCCGTATTACCACCCATTGGTTCCATGTATAAGCGGAACCACG 

AGTAAACGCTGGATTTCTATACAGAACAGGTCTGCTGTTGCAGGAACAACCTCTGCCGGG 

CTTGAAA1TCATGGTTTAAAACCGGAAGAATTCTTCGAGGATACACAAAT 

GCTCTGAAAAACTATTGGTCCTTGCTTACACCTCTAATTTTCTCTGACCATCCGAAGAGA 

CCCGGTGATGAGGATCCTCTCCCGCCTTTCAAC3ATGATACGCAATGTGATGGACATGCAT 

GCTCGTTTTGGGAATTTAAATGCCGCTTTACTCGACGAAGGAAAATCTGCTTGGGTAATG 

AACGTCGTCCCAGTCAATGCACGTAATACTCTTCCTATCATACTTGATCGTGGTTTCGCC 

GGTGTTCTACATGACTGGTGTGAACCATTCCCGACATATCCTCGAACATATGACATGCTT 

CATGCCAATGAACTTCTCACACATCTTAGCT 

TTGGAGATGGACCGGATTCTTCGCCCTGAGGGATGGGTTGTTCTAAGCGACAAAGTGGGA 
GTAATCGAGATGGCTCGAGCACTTGCAGCTCGAGTGCGTTGGGAAGCAAGAGTCATTGAT 
CTTCAAGATGGTAGTGACCAAAGACTTCTCGTCTGTCA2^^ 

>G1853 Amino Acid Sequence (domain in AA coordinates: entire protein) 
MRGSWYKSVSSVFGLRPRIRGLLFFIVGVVALVTILAPLTSNSYDSSSSSTLVPNIYSNY
RRIKEQAAVDYLDLRSLSLGASLKEFPFCGKERESYVPCYNITGNLLAGLQEGEELDRHC
EFEREKERCVVRPPRDYKIPLRWPLGRDIIWSGNVKITKDQFLSSGTVTTRLMLLEENQI
TFHSEDGDVFDGVKDYARQIAEMIGLGSDTEFAQAGVRTVLDIGCGFGSFGAHIjVSIjKLM 
P I CI AEYEATGSQVQIiALERGIiPAMI GNFFSKQIiPYPALSFDMVHCAQCGTTWDIKDAML 
LLEVDRVLKPGGYE^TSPTNKAQGNLP 

QKTSDSSCYSSRSQAS I PLCKDGDSVPYYHPLVPCISGTTSKRWI S IQNRS AVAGTTS AG 

LEIHGLKPEEFFEDTQIT^SAIjKNYWSLLTPLIFSDHPKRPGD 

ARFGNLNTU^IjDEGKSAWVMNVVPWARNTLPIILDRGFA 

HANELLTHLSSERCSIiMDLFIiEMDRI^^ 

LQDGSDQRLLVCQKPFIKK* 

>G1855 (1..1902) 

ATGGCGAAAGAGAACAGTGGTCATCATCACCAAACAGAAGo^GAAGAAAGAAACTAACT
TTGATTCTTGGTGTAAGTGGACTCTGCATTTTGTTCTATGTTTTAGGTGCATGGCAAGCC
AATACCGTCC(^TCTTCTATCTCGAAGCTCGGATGCGAGACGCAATCAAACCCTTCTTCG
TCCTCTTCCTCTTCCTCATCTTCAGAGT

ATTGAGTTAAAGGAAACAAACCAAACCATTAAGTACTTTGAACCATGTGAATTATCTCTC
AGTGAGTACACTCCTTGTGAAGACCGACAAAGAGGAAGAAGATTCGATAGGAACATGATG

AAATATAGAGAAAGACATTGTCCTGTAAAAGATGAGCTTCTTTATTGTTTGATTCCTCCT 

CCACCAAACTACAAGATTCCATTTAAATGGCCACAAAGTAGAGACTATGCTTGGTATGAC 

AATATCCCTCACAAGGAACTTAGTGTTGAGAAAGCAGTTCAAAACTGGATTCAAGTTGAA 

GGTGACCGCTTTAGATTCCCTGGTGGTGGTACTATGTTTCCTCGTGGAGCTGATGCTTAT 

ATCGATGATATTGCTAGGCTTATTCCTCTTACTGATGGTGGAATCAGAACAGCTATTGAC 

ACTGGATGTGGTGTTGCAAGTTTTGGTGCTTACCTCTTGAAGAGAGACATTATGGCTGTG 

TCTTTTGCTCCAAGAGACACTCATGAAGCTCAGGTACAGTTTGCTTTAGAACGCGGAGTT

CCTGCGATAATCGGGATTATGGGATCAAGAAGACTTCCTTATCCAGCTAGAGCTTTTGAT 

CTTGCTCATTGTTCTCGTTGTTTGATCCCTTGGTTTAAAAATGATGGTTTGTAC 

GAGGTCGACCGGGTTTTAAGACCGGGCGGTTACTGGATCCTCTCGGGACCACCGATTAAC 

TGGAAACAGTACTGGAGAGGGTGGGAGAGAACAGAGGAGGATTTGAAGAAAGAGCAAGAT 

TCAATAGAAGATGTAGCAAAGAGTCTTTGC^GGAAGAAAGTAACTGAAAAAGGTGACTTA 

TCAATTTGGCAAAAGCCTCTCAATCACATTGAGTGT 

TCACCTCCGATATGCAGCTCAGATAACGCGGATTCCGCTTGGTACAAAGACTTGGAAACT 
TGTATAACACCATTACCAGAAACAAACAATCCAGATGATTCAGCAGGCGGTGCACTCGAG 
GATTGGCCAGACCGAGCATTCGCGGTACCTCCAAGAATCATCAGAGGAACTATACCAGAA 
ATGAACGCGGAGAAATTTAGAGAAGACAACGAGGTTTGGAAAGAGAGAATAGCACATTAC
AAGAAGATAGTCCCTGAGCTTTCACATGGAAGATTCAGGAACATTATGGACATGAACGCT 
TTTCTCGGCGGATTCGCTGCTTCCATGCTGAAATATCCCTCATGGGTCATGAACGTTGTC 
CCGGTCGATGCAGAGAAACAAACGTTAGGTGTGATCTACGAACGTGGATTGATAGGGACG 
TATCAAGATTGGTGTGAAGGATTCTCAACGTATCCAAGAACTTATGATATGATTCATGCA 
GGAGGATTGTTCAGCTTATACGAACATAGGTGTGATTTGACGTTGATATTGTTGGAGATG 
GATCGAATTTTGAGACCAGAAGGAACAGTTGTGTTGAGAGATAATGTGGAGACGTTGAAT 
AAGGTAGAGAAGATAGTGAAGGGAATGAAGTGGAAGAGTCAAATTGTTGATCATGAGAAA 
GGTCCTTTTAATCCTGAGAAGATTCTTGTTGCTGTTAAAACTTATTGGACTGGTCAACCT 
TCTGACAAGAACAACAACAACAACAACAACAACAACAACTAG 

>G1855 Amino Acid Sequence (domain in AA coordinates : entire protein) 
MAKENSGHHHQTEARRKKLTLILGVSGLCILFYVLGAWQANTVPSSISKLGCETQSNPSS
SSSSSSSSESAELDFKSHNQIELKETNQTIKYFEPCELSLSEYTPCEDRQRGRRFDRNMM
KYRERHCPVKDELLYCLIPPPPNYKIPFKWPQSRDYAWYDNIPHKELSVEKAVQNWIQVE
GDRFRFPGGGTMFPRGADAYIDDIARLIPLTDGGIRTAIDTGCGVASFGAYLLKRDIMAV
SFAPRDTHEAQVQFALERGVPAIIGIMGSRRLPYPARAFDLAHCSRCLIPWFKNDGLYLM
EVDRVLRPGGYWILSGPPINWKQYWRGWERTEEDLKKEQDSIEDVAKSLCWKKVTEKGDL
SIWQKPLNHIECKKLKQNNKSPPICSSDNADSAWYKDLETCITPLPETNNPDDSAGGALE
DWPDRAFAVPPRIIRGTIPEMNAEKFREDNEVWKERIAHYKKIVPELSHGRFRNIMDMNA
FLGGFAASMLKYPSWVMNVVPVDAEKQTLGVIYERGLIGTYQDWCEGFSTYPRTYDMIHA
GGLFSLYEHRCDLTLILLEMDRILRPEGTVVLRDNVETLNKVEKIVKGMKWKSQIVDHEK
GPFNPEKILVAVKTYWTGQPSDKNNNNNNNNNN*
>G187 (118.. 1074) 

TAGACCTCTTAGGAAAAAAACCTAAAAACCTAATCCCCAAACCTAAAAGGCTTATCTCAT 

CTCTTCTTCTTTGTCTTCTTTACTCTTTTTTTACCTCTCTCTTCATTGTTCTTCA 

TCTAATGAAACCAGAGATCTCTACAACTACCAATACCCTTCATCGTTT^CGTTGCACGAA 

ATGATGAATCTGCCTACTTCAAATCCATCTTCTTATGGAAACCTCCCATCACAAAACGGT 

TTTAATCCATCTACTTATTCCTTCACCGATTGTCTCCAAAGTTCTCCAGCAGCGTATGAA 

TCTCTACTTGAGAAAACTTTTGGTCTTTC^ 

ATCGATCAAGAACCGAACCGTGATGTTACTAATGACGTAATCAATGGTGGTGCATGCAAC 
GAGACTGAAACTAGGGTTTCTCCTTCTAATTCTTCCTCTAGTGAGGCTGATCACCCCGGT 
GAAGATTCCGGTAAGAGCCGGAGGAAACGAGAGTTAGTCGGTGAAGAAGATCAAATTTCC 
AAAAAAGTTGGGAAAACGAAAAAGACTGAGGTGAAGAAACAAAGAGAGCCACGAGTCTCG 
TTTATGACTAAAAGTGAAGTTGATCATCTTGAAGATGGTTATAGATGGAGAAAATACGGC 
CAAAAGGCTGTAAAAAATAGCCCTTATCCAAGGAGTTACTATAGATGTACAACACAAAAG 
TGCAACGTGAAGAAACGAGTGGAGAGATCGTTCCAAGATCCAACGGTTGTGATTACAACT 
TACGAGGGTCAACACAACCACCCGATTCCGACTAATCTTCGAGGAAGTTCTGCCGCGGCT 
GCTATGTTCTCCGCAGACCTCATGACTCCAAGAAGCTTTGCACATGATATGTTTAGGACG 
GC^GCTTATACTAACGGCGGTTCTGTGGCGGCGGCTTTGGATTATGGATATGGACAAAGT 
GGTTATGGTAGTGTGAATTCAAACCCTAGTTCTCACCAAGTGTATCATCAAGGGGGTGAG 
TATGAGCTCTTGAGGGAGATTTTTCCTTCAAT^ 

ATTGTTATAACTACATATATTATATATATTGAGAGAGAGAGGTAGAGAAAAAAAAA 
>G187 Amino Acid Sequence (domain in AA coordinates: 172-228) 
MSNETRDLYNYQYPSSFSLHEMMNLPTSNPSSYGNLPSQNGFNPSTYSFTDCLQSSPAAY 
ESLLQKTFGLSPSSSEVFNSSIDQEPNRDVTNDVINGGACNETETRVSPSNSSSSEADHP
GEDSGKSRRKRELVGEEDQISKKVGKTKKTEVKKQREPRVSFMTKSEVDHLEDGYRW 
GQKAVKNSPYPRSYYRCTTQKOtfVKKRVERSFQDPO^ 

AAMFSADLMTPRSFAHDMFRTAAYTNGGSVAAALDYGYGQSGYGSVNSNPSSHQVYHQGG
EYELLREIFPSIFFKQEP*
>G1881 (1..519)

ATGCGAATTTTGTGTGATGCTTGTGAGAGCGCCGCCGCTATCGTCTTTTGCGCCGCCGAC 
GAAGCTGCCCTCTGTTGCTCCTGCGACGAAAAAGTTCATAAGTGCAACAAGCTGGCTAGT
CGGCATCTTCGTGTAGGCTTAGCTGATCCGAGTAATGCACCAAGCTGTGACATATGCGAA 
AATGCACCCGCATTCTTTTACTGTGAGATAGATGGTAGTTCCCTTTGTCTACAATGTGAT 
ATGGTGGTACATGTTGGTGGGAAGAGAACACATAGGCGGTTTCTATTACTGAGACAGAGA 
ATTGAGTTTCCAGGCGATAAGCCTAATCATGCTGACCAACTGGGACTACGGTGTCAAAAG 
GCTTCCTCTGGTCGTGGTCAAGAATCAAATGGGAATGGTGATCATGATCATAATATGATC 
GATCTTAACTCCAATCCTCAAAGAGTACACGAGCCTGGATCACATAACCAAGAGGAGGGT
ATTGATGTAAATAACGCAAACAATCACGAGCATGAATAG 

>G1881 Amino Acid Sequence (domain in AA coordinates : 5-28 , 56-79) 
MRILCDACESAAAIVFCAADEAALCCSCDEKVHKCNKLASRHLRVGLADPSNAPSCDICE
NAPAFFYCEIDGSSLCLQCDMVVHVGGKRTHRRFLLLRQRIEFPGDKPNHADQLGLRCQK
ASSGRGQESNGNGDHDHNMIDLNSNPQRVHEPGSHNQEEGIDVNNANNHEHE* 

>G1882 (1..1200)

ATGGTTTTTTCTTCATTTCCTACTTATCCTGATGATTCATCAAACTGGCAACAACAACAT 
CAACCAATCACAACCACCGTTGGATTCACGGGAAATAACATCAACCAACAGTTTCTTCCT 
CACGATCCCCTCCCACCGCAACAGC^^ 

AACGGCGGAGTCGCTGTTCCCGGTGGACCTGGCGGGTTAATCCGACCAGGTTCGATGGCG 

GAAAGAGCAAGGCTAGCCAACATACCZATTACCTGAAACAGCCTTGAAGTGTCCAAGATGT 

GACTCAACTAACACCAAATTCTGTTACTTCAACAACTACAGTCTCACTC^ 

TTCTGCAAAGCATGCCGTCGTTACTGGACACGTGGCGGTGCTCTAAGGAGCGTTCCCGTC 

GGTGGCGGTTGCCGTAGAAACAAAAGAACCAAAAACAGCAGCGGTGGAGGTGGCGGTAGC 

ACCAGTAGCGGTAACAGCAAGTCACAAGACAGCGCCACGAGCAACGACGAATA 

CGAGCC7VTGGCTAACAATCAGATGGGACCACCTTCTTCGTCATCGTCTCTAAGCTCGTTG 

CTGTCTTCTTACAACGCAGGGTTAATCCCCGGACATGATCATAACAGCAATAACAACAAC 

ATACTTGGACTTGGATCATCTTTGCCTCCTCTTAAGCTTATGCCTCCTTTAGACTTCACA 

GACAACTTCACCTTACAATACGGTGCCGTTTC^GCTCCTrCTTATCATATAGGCGGTGGA 

AGCAGTGGAGGAGCGGCGGCTCTTTTAAACGGTTTTGACCAGTGGAGATTCCCGGCAACA 

AACCAACTTCCTTTAGGCGGTTTAGACCCGTTTGATCAACAAGATCAAATGGAGCAG 

AATCCAGGTTACGGATTGGTTACCGGGTCGGGTCAGTATCGACCTAAGAACATTTTCCAT 

AACCTTATCTCCTCTTCTTCGTCTGCTTCATCAGCTATGGTTACAGCCACCGCGTCGCAA 

TTAGCTTCAGTGAAAATGGAAGATAGTAACAATCAGCT 

GGAGACGAACAACAGCTCTGGAATATTCATGGCGCTGCTGCAGCATCCACCGCAGCTGCA 
ACAAGTTCGTGGAGTGAAGTCTCTAATAATTTC^GTTCTTCTTCTACTAGCAATATATAA 
>G1882 Amino Acid Sequence (domain in AA coordinates : 97-125) 
MVFSSFPTYPDHSSl^QQQHQPITTTVGFTC 

NGGVAVPGGPGGLIRPGSMAERARIiANIPLPETAIjKCPRCDSTNTKFCYFNNYSLTQP 

FCKACRRY^miGGALRSVPVGGGCRRNKRTKNSSGGGGGSTSSGNSKSQDSATSIiroQYHH 

RAMAI^QMGPPSSSSSLSSLLSSYNAGLIPGHDI^ 

DNFTLQYGAVSAPSYHIGGGSSGGAAALLNGFIX^WRFPATNQLPIjGGIjDPFDQQHQMEQQ 
NPGYGLVTGSGQYRPKNIFHNLISSSSSASST^MVTATASQIiASVKMEDSNNQLNLS 
GDEQQLWNIHGAAAASTAAATSSWSEVSNNFSSSSTSNI* 
>G1883 (1..1110) 

ATGGACGCTACGAAGTGGACACAGGGTTTTCAAGAAATGATGAACGTTAAACCAATGGAG 
CAGATCATGATTCCTAATAAC AACAC ACATCAAC CAAACAC CACATCCAATG CAAGGCCA 
AACACCATTCTCACATCTAACGGCGTCTCAACTGCTGGAGCAACCGTCTCCGGCGTAAGC 
AACAACAATAACAATACGGCGGTTGTGGCGGAGAGGAAAGCAAGACCACAAGAGAAACTA 
AATTGTCGAAGATGCAACTCAACGAACAC^^ 

ACACAACCAAGATACTTCTGCAAAGGTTGTCGAAGGTATTGGACCGAAGGTGGATCTCTT 
AGGAATGTTCCTGTGGGAGGAAGCTCAAGAAAGAACAAGAGATCATCTTCATCTTCTTCA 
TGAAAGATCCTTCAGACAATACCATCT^ 

TCAAACCAAATCCATAATAAATCGAAAGGGTCATCACAAGATCTCAACTTGTTGTCTTTC 
CCAGTGATGCAAGATCAACATCATCATCATGT^ 

AAGATGGAGGGAAATGGTAACATAACT(^TCAG(^GCAGCCTTCATC^TCTTCTTCTGTC 

TATGGTTCCTCGTCeTCTCCTGTTTCAGCTCTTGAACTTTTAAGAACCGGAGTTAATGTT 

TCTTCAAGATCAGGGATTAACTCATCGTTCATGCCTTCCGGTTCAATGATGGATTCAAAC 

ACTGTGCTTTACACTTCTTCAGGGTTTCCAACAATGGTGGATTACAAGCCAAGTAATCTC 

TCCTTCTCTACCGATCATCAAGGGCTTGGACACAATAGCAACAATAGGTCTGAAGCTCTT 

GATAGTGATCATCACCAACAAGGTAGAGTTTTGTTTCCATTTGGGGATCAAATGAAGGAG 

CTTTCATCAAGCATAACACAAGAAGTTGATCATGATGATAATCAACAACAGAAG 

GGAAATAATAATAATAATAATAACTCAAGCCCTAATAATGGATATTGGAGTGGGATGTTC 

AGTACTACAGGAGGAGGATCTTCATGGTGA 

>G1883 Amino Acid Sequence (domain in aa coordinates: 82-124) 
MDATKWTQGFQEMMNVKPMEQIMIPNNNTHQPNTTSNARPOT 
NNNNNTAWAERKARPQEKLNCP

RNVPVGGSSRKNKRSSSSSSSNIIiQTIPSSLPDLNPPILFSNQIHNKSKGSSQDIiNLLSF 
PVMQDQHHHHVHMSQFLQMPKMEGNGNira^ 

SSRSGINSSFMPSGSMMDSNTVLYTSSGFPTMVDYKPSNLSFSTDHQGIjGHNSNNRSEAIj 
HSDHHQQGRVLFPFGDQMKEIjSSSITQBVDHDDNQQQKSHGNNWIJN^ 

sttgggssw* 

>G1884 (1..741) 

ATGATGACGTCATCCCATCAGAGCAACACCACCGGCTTTAAACCGCGGCGGATCAAGACG
ACGGCGAAGCCACCACGTCAGATCAATAACAAAGAACCATCTCCGGCGACGCAGCCGGTG 
CTCAAGTGTCCGAGATGTGATTCAGTC^C 

TTGTCTCAGCCACGTCACTACTGCAAGAACTGTCGTCGTTACTGGACACGTGGCGGCGCC 
CTCCGTAACGTTCCCATCGGTGGCTCCACTCGAAACAAGAACAAGCCTTGCAGCCTCCAA 
GTCATCTCTTCTCCTCCTTTGTTCTCGAACGGGACGTCATCGGCGTCTCGTGAGCTTGTA 
AGAAACCATCCATCGACGGCAATGATGATGATGAGTTCTGGTGGATTCTCCGGCTATATG 
TTTCCGTTGGATCCTAACTTCAACCTTGCCTCGTCTTCTATCGAGTCTTTGAGTTCTTTT 
AACCAAGATTTGCACCAGAAGCTTCAGCAACAAAGA 

GATTCTCT^CCGGTTAACGAGAAAACGGTTATGTTTCAGAACGTAGAGTTGATTCCTCCT 
TCGACGGTGACGACGGATTGGGTTTTCGATAGGTTCGCCACTGGAGGAGGTGCAACAAGT 
GGCAATCATGAAGATAATGATGATGGGGAGGGTAATTTGGGAAATTGGTTCCATAATGCT 
AATAATAATGCTCTGCTCTAA 

>G1884 Amino Acid Sequence (domain in AA coordinates : 43 -71) 
MMTSSHQSNTTGFKPRRIKTTAKPPRQINNKEPSP 

LSQPRHYCKNCRRYWTRGGALRNVPIGGSTRNKNKPCSLQVISSPPLFSNGTSSASRELV

RNHPSTAMMMMSSGGFSGYMFPLDPNFNLASSSIESLSSFN^QDLHQKLQQQRLVT 

DSLPVNEKTVMFQNVELIPPSTVTTDWVFDRFATGGGATSGNHEDNDDGEGNLGNWFHNA

NNNALL* 

>G1891 (1..750) 

ATGGATAACTTGAATGTTTTCGCAAATGAAGACAATCAAGTGAATGATGTGAAGCCCCCA 

CCACCACCACCTCGAGTGTGTGCAAGGTGTGATTCTGATAATACTAAATTTTGTTATTAC 

AACAACTACTGTGAGTTTCAGCCACGATACTTCTGCAAGAACTGTCGTAGATACTGGACT 

CATGGTGGGGCTTTJ^GAAACATACCAATTGGTGGAAGTAGTCGTGCCAAACGGG 

GTAAATCAACCTTCGGTTGCTCGGATGGTTTCTGTTGAGACCCAACGAGGTAACAATCAA 

CCTTTCTCTAATGTTCAAGAAAACGTTCATCTTGTTGGATCTTTTGGTGCTTCATCTTCA 

TCTTCTGTTGGTGCTGTTGGGT^CCTTTTTGGTTCTTTGTATGATATTCATGGTGGTATG 

GTAACAAATTTGCATCCAACTCGAACTGTT 

GGATCATTTGAGCAAGACTATTACGATGTTGGGTCCGATAATCTTTTGGTCAACCAACAA 
GTTGGTGGCTACGGTTATCACATGAATCCAGTGGATCAATTCAAGTGGAACCAGAGCTTC 
AACAACACTATGAACATGAATTATAATAACGATAGCACTAGTGGAAGTAGCAGAGGATCT 
GACATGAATGTGAACCATGATAACAAGAAGATCAGATACCGCAACTCTGTGATTATGCAT 
CCTTGTCATCTGGAGAAGGATGGTCCTTGA 

>G1891 Amino Acid Sequence (domain in aa coordinates: 27-69) 

MDNLNVFANEDNQVNDVKPPPPPPRVCARCDSDNTKFCYYNNYCEFQPRYFCKNCRRYWT
HGGALRNIPIGGSSRAKRARVNQPSVARMVSVETQRGNNQPFSNVQENVHLVGSFGASSS
SSVGAVGNLFGSLYDIHGGMVTNLHPTRTVRPNHRLAFHDGSFEQDYYDVGSDNLLVNQQ
VGGYGYHMNPVDQFKWNQSFNNTMNMNYNNDSTSGSSRGSDMNVNHDNKKIRYRNSVIMH

PCHLEKDGP* 

>G1896 (1..951) 

ATGTCCTCCCATAe€AATCTCCCCTCTCCCAAACCAGTTCCTAAACCAGATCACCGTATC 

TCCGGTACATCCCAAACCAAGAAACCACCGTCTTCCTCCGTAGCTCAAGACCAACAAAAC 

CTAAAATGCCCTCGTTGCAACTCTCCAAACACAAAGTTCTGTTACTACAACA 

CTCTCTCAACCTCGTCACTTCTGCAAATCTTGTCGCCGTTACTGGACACGTGGCGGTGCT 

CTAAGAAACGTCCCCATCGGTGGTGGTTGCCGGAAAACCAAAAAATCTATCAAACCTAAT 

TCCTCCATGAACACACTTCCTTCGTCTTCTTCCTCTCAGAGGTTCTTCTCATCAATCATG 

GAAGATTCATCCAAATTCTTCCCTCCTCCGACAACAATGGATTTTCAGCTGGCCGGATTA 

TCTCTCAACAAAATGAACGATCTTCAACTTTTGAATAACCAAGAAGTTCTTGATCTTAGG 

CCCATGATGTCCTCGGGCCGAGAAAACACACCCGTTGATGTCGGGTCGGGTTTATCCCTA 

ATGGGTTTTGGAGATTTCAACAACAACCATTCACCGACGGGGTTCACAACCGCCGGAG 






WO 03/013227 PCT/US02/25805 




AGCGACGGAAACTTAGCTTCTTCTATAGAGACTTTG^ 

TGGAGGCTTGAGCAACAGAGGATGGCGATGCTTTTTGGTAATTCTAAGGAAGAAACTGTT 

GTCGTCGAGAGGCCACAACCTATTCTTTATCGGAATCTTGAGATCGTAAACTCATCATCG 

CCGTCGTCGCCGACGAAGAAAGGAGATAATCAGACAGAGTGGTATTTTGGTAATAACAGT 

GATAATGAAGGAGTGATTAGTAATAATGCTAATACAGGAGGAGGAGGAAGTGAATGGT^AC 

AATGGAATTCAAGCTTGGACTGATCTTAATCATTATAATGCATTGCCTTGA 

>G1896 Amino Acid Sequence (domain in aa coordinates: 43-85) 

MSSHTlfoPSPKPVPKPDHRISGTSQTKKPPSSSVA 

LSQ PRHFCKS CRRYWTRGGALRNVP IGGGCRKTKKS I KPNS SMNTLP S S S S S QRFFS S IM 

EDSSKFFPPPTT^FQLAGLSLNKMiroLQLIjNNQEVIjDLR^ 

MGFGDFNNNHSPTGFTTAGASDGNLASSIETLSCI^ 

VVERPQPILYRNLEIWSSSPSSPTKKGDNQTEWYFGimSDNEGVISNNANTGGGGSEWN 

NGIQAWTDLNHYNALP* 

>G1898 (1..630) 

ATGCCGTCGGAACCAAACCAAACCCGACCCACCAGAGTTCAGCCCTCAACGGCGGCTTAC 
CC^CCGCCAAATCTGGCTGAGCCTCTTCCT^ 

TTCTGTTACTACAACAACTATAACCTCGC^CAGCCTCGCTACTA.CT^CAAATCTTGCCG^ 
CGTTACTGGACTCAAGGTGGTACACTCCGTGACGTCCCCGTCGGTGGTGGAACTCGTCGA 
AGCTCCTCAAAACGTCACCGTTCTTTCTCCACCACTGCCACCTCCTCn^CCTCCTCTTCT 
TCCGTCAT(^CCACC^CGACACAAGAACCAGCCACGACTGAAGCGAGTCAAACTAAGGTT 
ACTAATTTAATTTCAGGTCATGGAAGCTTTGCTTCTCTGTTAGGTTTAGGAAGTGGAAAT 
GGTGGGTTGGATTACGGGTTTGGGTACGGGTACGGGCTTGAGGAGATGAGTATTGGGTAT 
CTTGGAGATTCTTCCGTAGGAGAGATTCCGGTGGTTGATGGTTGTGGTGGTGACACGTGG 
CAGATTGGGGAGATTGAAGGTAAAAGTGGAGGAGACAGTTTGATATGGCCTGGTCTTGAG 
ATCTCAATGCAAACCAACGATGTTAAGTGA 

>G1898 Amino Acid Sequence (domain in AA coordinates : 31-59) 
MPSEPNQTRPTRVQPSTAAYPPPNLAEPLPCPRCNSTTTK^ 

RYWTQGGTLRDVPVGGGTRRSSSKRHRSFSTTATSSSSSSSVITTTTQEPATTEASQTKV
TNLISGHGSFASLLGLGSGNGGLDYGFGYGYGLEEMSIGYLGDSSVGEIPVVDGCGGDTW
QIGEIEGKSGGDSLIWPGLEISMQTNDVK* 
>G1902 (1..615) 

ATGCAGGATCCAGCAGGATATTACCAGACGATGATGGCGAAGCAACAACAACAAGAACAA 
CCACAGTTTGCAGAGCAAGAACAGTTAAAGTGTCCTCGTTC 

TTCTGTTACTACAACAACTACAATCTCTCACAGCCTCGTCACTTTTGCAAAAGCTGTCGT
CGTTACTGGACTAAAGGCGGCGCTCTCCGTAACGTTCCCGTCGGTGGTGGTTCTCGTAAG 
AACGCAACCAAACGATCCACTTCTTCTTCTTCTTCTGCTTCCTCTCCTTCCAACAGTAGC 
CAAAACAAGAAGACGAAAAACCCGGATCCGGATCCTGATCCACGTAATTCTCAAAAACCG 
GATTTGGATCCGACCCGGATGCTTTACGGGTTTCCGATCGGTGACCAAGACGTGAAGGGT 
ATGGAGATTGGTGGAAGCTTTAGCTCGTTGTTGGCGAATAATATGCAGCTTGGTCTTGGA 
GGAGGAGGGATCATGCTTGACGGGTCGGGTTGGGATCATCCGGGTATGGGTTTGGGTTTG 
AGGAGAACCGAACCGGGTAATAATAATAATAACCCATGGACCGATCTGGCTATGAACAGA 
GCGGAGAAAAACTGA 

>G1902 Amino Acid Sequence (domain in AA coordinates : 31-59) 
MQDPAAYYQTMMAKQQQQQQPQFAEQEQL^ 

RYWTKGGALRNVPVGGGSRKNATKRSTSSSSSASSPSNSSQNKKTKNPDPDPDPRNSQKP
DLDPTRMLYGFPIGDQDVKGMEIGGSFSSLLANNMQLGLGGGGIMLDGSGWDHPGMGLGL
RRTEPGNNNNNPWTDLAMNRAEKN*
>G1904 (1..924)

ATGCAAGATATTCATGATTTCTCCATGAACGGAGTTGGTGGTGGGGGAGGAGGAGGAGGG 
AGGTTTTTCGGTGGAGGAATCGGCGGCGGAGGAGGTGGTGATCGAAGGATGAGAGCTCAT 
CAGAACAATATACTTAACCATCATCAATCTCTCAAGTGTCCTCGTTGTAATTCTCTTAAC 
ACAAAGTTCTGTTACTACAACAATTACAATCTT^ 

TGTCGTCGTTACTGGACTAAAGGTGGTGTTCTCCGTAACGTTCCCGTCGGAGGTGGTTGC 
CGGAAAGCTAAACGTTCGAAAACAAAACAGGTTCCGTCGTCGTCATCAGCCGACAAACCA 
ACGACGACGCAAGATGATCATCACGTGGAGGAGAAATCGAGTACAGGATCTCACTCTAGC 
AGCGAGAGCTCTTCTCTCACCGCTTCTAACTCTACCACCGTCGCCGCCGTCTCCGTCACC 
GCGGCGGCGGAAGTTGCTTCGTCGGTTATTCCAGGTTTTGATATGCCTAATATGAAAATT 






TACGGTAACGGGATCGAGTGGTCGACGTTACTTGGACAAGGCTCATCGGCCGGTGGTGTT 
TTCTCGGAGATCGGTGGTTTTCCGGCGGTTTCAGCTATTGAAACTACACCGTTTGGATTC 
GGGGGTAAATTCGTAAATCAAGATGATCATCTGAAGTTAGAAGGTGAAACTGTACAGCAG
CAACAGTTTGGAGATCGAACGGCTCAGGTTGAGTTTCAAGG 

ATGGGATTTGAACCGTTGGATTGGGGAAGTGGCGGTGGAGATCAAACACTGTTTGATTTA 
ACCAGTACCGTTGATCATGCATACTGGAGTCAAAGTCAATGGACGTCGTCTGACCAAGAT 
CAGAGTGGTCTCTACCTTCCTTGA 

>G1904 Amino Acid Sequence (domain in aa coordinates: 53-95) 

MQDIHDFSMNGVGGGGGGGGRFFGGGIGGGGGGDRRMRAHQ 

TKFCYYNNYNLSQPRHFCKNCRRW^ 

TTTQDDHHVEEKSSTGSHSSSESSSLTASNSTTVAAVSVTAAAEVASSVIPGFDMPNMKI

YGNGIEWSTLLGQGSSAGGVFSEIGGFPAVSAIETTPFGFGGKFVNQDDHLKLEGETVQQ

QQFGDRTAQVEFQGRSSDPI^GFEPLDWGSGGGDQTLFDLTSTVDHAYWSQSQWTSSDQD

QSGLYLP* 

>G1906 (1..795) 

ATGGTGGAACGTGCTCGGATCGCAAAAGTCCCATTGCCTGAAGCAGCTCTAAATTGCCCT 

AGATGTGACTCAACCAATACTAAGTTCTGTTACTTCAATAACTATAGCCTTACTCAAC 

CGCCATTTCTGCAAAACATGTCGTCGCTATTGG 

CCTGTTGGAGGAGGCTTTAGGAGGAACAAGAGAAGCAAATCCAGATCX3AAATCTACGGT^ 
GTGGTCTCGACTGATAATACTACTAGTACT^ 

AACCCTAGCAAGTTTCATAGCTACGGTCAAATCCCGGAGTTTAATTCCAACTTGCCCATC 

TTGCCTCCTCTCCAAAGCCTTGGAGATTACAATTCAAGCZAACACTGGATTAGA 

GGAACTCAAATAAGCAACATGATAAGTGGTATGAGTTCTAGTGGTGGGATCTTGGATGCA 

TGGAGAATACCTCCATCACAACAAGCTGAGCAAT^ 

TTGGTGCAATCTTCAAACGCGTTATATC(^TTACT^ 

ACAAGAAATGTGAAGGCGGAAGAGAATGATCAGGATCGGGGTAGGGATGGGGATGGAGTG 

AATAACTTATCAAGAAACTTTI^GGGTAATATCAACATAAAC 

TACACATCATGGGGAGGTAACAGTTC 

CATCTCTCATTCTAA 

>G1906 Amino Acid Sequence (domain in AA coordinates : 19-47) 
MVERARIAKVPLPEAALNCPRCDSTO^ 

PVGGGFRRHKRSKSRSKSTVVVSTDNTTSTSSLTSRPSYSNPSKFHSYGQIPEFNSNLPI

LPPLQSLGDYNSSNTGLDFGGTQISNMISGMSSSGGILDAWRIPPSQQAQQFPFblNTTG

LVQSSNAIiYPLLEGGVSATQTRSTVKA^ 

YTSWGGNSSWTGFTSNWSTGHLSF* 

>G1913 (1..744) 

ATGGAGAGAGCAGAGGCCTTGACATCATCGTTTATATGGCGGCCAAACGCAAACGCAAAC 
GCGGAGATCACGCCGAGTTGTCCAAGATGTGGATCCTCTAACACAAAGTTCTGTTACTAC 
AACAACTATAGCCTCACTCAGCCTCGCTACTTCTGCAAAGGCTGCCGCAGATATTGGACC 
AAAGGTGGTTCCCTCCGCAATGTTCCTGTAGGCGGTGGCTGTCGAAAATCCCGCCGCCCC 
AAATCATCTTCTGGTAACAATACTAAAACTAGCCTAACCGCTAATTCTGGCAACCCCGGT 
GGTGGTTCACCAAGCATCGATCTTGCTCTTGTTTACGCCAATTTCTTGAATCCAAAGCCT 
GACGAATCTATACTACAAGAAAATTGCGACTTAGCCACTACGGATTTTTTGGTAGATAAT 
CCTACCGGCACTTCCATGGACCCTTCATGGAGTATGGACATCAATGATGGTCATCATGAT 
CATTATATTAATCCGGTGGAACACATTGTGGAGGAATGTGGTTATAATGGCTTGCCTCCA 
TTTCCTGGTGAAGAGCTTCTCTCTTTAGACACTAATGGTGTTTGGTCTGATGCTTTGTTG 
ATTGGTCATAACCATGTAGACGTTGGCGTGACTCCGGTTCAGGCTGTACACGAACCGGTG 
GTTCATTTCGCTGAeGAATCCAATGATTCCACCAATCTCTTGTTTGGAAGTTGGAGCCCT 
TTTGATTTCACTGCCGATGGATGA 

>G1913 Amino Acid Sequence (domain in AA coordinates: 27-55) 

MERAEALTSSFIWRPNANANAEITPSCPRCGSSNTKFCYYNNYSLTQPRYFCKGCRRYWT 

KGGSLRNVPVGGGCRKSRRPKSSSGNNTKTSLTANSGNPGGGSPSIDLALVYANFLNPKP

DESILQENCDLATTDFLVDNPTGTSMDPSWSMDINDGHHDHYINPVEHIVEECGYNGLPP

FPGEELLSLDTNGVWSDALLIGHNHVDVGVT^

FDFTADG* 

>G1914 (1..945) 

ATGGAGAGATACAAGTGTAGATTTTGCTTCAAGAGCTTCATCAATGGAAGAGCTTTAGGT 
GGTCACATGAGATCTCAC^TGCTTACTCTTTCTGC^GAACGTTGTGTAATAACTGGTGAA

GCAGAAGAAGAAGTAGAGGAACGGCCGAGTCAACTCTGTGACGACGACGACGATACCGAG 

TCCGATGCTTCTTCTTCTTCTGGTGAGTTTGATAATCAAAAGATGAATCGTCTTGATGAT 

GAATTGGAGTTTGATTTCGCTGAAGACGACGACGTTGAAAGTGAAACCGAGTCGTCCAGG 

ATTAACCCAACTCGGCGACGATCTAAACGAACTCGGAAACTTGGATCGTTTGATTTCGAC 

TTTGAGAAGCTAACAACGAGCCAACCCAGTGAGTTAGTGGCCGAGCCAGAGCATCACAGC

TCAGCTTCTGATACAACAACGGAGGAAGATCTCGCCTTTTGTCTCATTATGCTGTCCAGA 

GACAAATGGAAGCAACAGAAGAAGAAGAAGCAACGTGTAGAAGAAGATGAGACAGATCAT

GACAGTGAAGATTACAAATCAAGCAAGAGCAGAGGGAGATTCAAGTGTGAGACTTGTGGT 

AAAGTGTTTAAATCGTATCAAGCATTAGGAGGACACAGAGCAAGCCACAAGAAGAAC^^ 

GCATGCATGACGAAAACAGAGCAAGTTGAAACAGAGTACGTTCTTGGAGTAAAGGAGAAG 

AAAGTTCATGAATGTCCGATCTGTTTTAGGGTTTTTACTTCAGGGCAAGCACTTGGAGGT 

CATAAGAGATCTCACGGAAGTAACATCGGAGCAGGAAGAGGATTGTCAGTAAGTCAAATT 

GTCCAAATCGAAGAAGAAGTATCAGTGAAACAGAGGATGATTGATCTTAATCTTCCTGCA 

CCTAATGAAGAAGATGAAACTTCTTTGGTGTTTGATGAATGGTGA 

>G1914 Amino Acid Sequence (domain in AA coordinates : 195-216, 245-266) 

MERYKCRFCFKSFINGRALGGHMRSHMLTLSAERCVITGEAEEEVEERPSQLCDDDDDTE

SDASSSSGEFDNQKMNRLDDELEFDFAEDDDVESETESSRINPTRRRSKRTRKLGSFDFD

FEKLTTSQPSELVAEPEHHSSASDTTTEEDLAFCLIMLSRDKWKQQKKKKQRVEEDETDH

DSEDYKSSKSRGRFKCETCGKVFKSYQALGGHRASHKKNKACMTKTEQVETEYVLGVKEK

KVHECPICFRVFTSGQALGGHKRSHGSNIGAGRGLSVSQIVQIEEEVSVKQRMIDLNLPA 

PNEEDETSLVFDEW* 
>G1925 (1..945) 

ATGGAAGAAAATCTTCCTCCGGGGTTCAGATTTCATCCTACAGACGAGGAGCTCATAACG 
CATTATCTATGTCGGAAAGTCTCCGATATAGGATTC^CCGGTAAAGCTGTCGTCGACGOT 
GATCTCAACAAGTGTGAACCTTGGGATTTGCCAGCCAA 

TGGTATTTCTTCAGCCAAAGGGATCGGAAATATCCAACCGGTTTAAGAACAAACCGGGC^ 
ACAGAAGCTGGTTACTGGAAAACCACCGGGAAAGATAAAGAAATATACCGAAGTGGAGTG 
TTGGTTGGGATGAAGAAAACCCTAGTTTTCTAGAAAGGAAGAGCTCCCAAAGGTGAGAAA 
AGCAATTGGGl^ATGCATGAGTACAGGCTTGAGAGCAAACAACCTTTCAACCCCACGAAT 
AAGGAGGAATGGGTAGTGTGTAGGGTTTTCGAAAAGAGCACGGCAGC^^AGAAAGCACAA 
GAACAACAACCTCAATCTTCTCAACCATCIT^ 

ATGGCAAATGAGTTTGAAGATATTGATGAGCTTCCGAATCTGAATTCAAACTCATCAACC 

ATCGATTACAATAATCATATCCATCAATATTCGCAACGCAATGTTTACTCAGA 

ACAACAAGTACGGCTGGTCTCAACATGAACATGAACATGGCTAGTACTAATCTTCAGTCT 

TGGACAACAAGTCTCCTTGGTCCGCCTTTATCTCC^ATCAACTCTTTGTTGCTCAAGGCT 

TTCCAAATCAGGAACTCTTATAGTTTCCCAA^GAGATGATCCCCAGTTTCAATCA 

TCTCTTCAACAAGGAGTCTCCAATATGATCCAAAAT^ 

CCCCAACCGCAAGAGGAAGCGTTTAATATGGACTCCATATGGTGA 

>G1925 Amino Acid Sequence (conserved domain in AA coordinates: 6-150)
MEKNLPPGFRFHPTDEELITHYLCRK^ 

WYFFSQRDRKYPTGLRTNRATEAGYWKTTGKDKEIYRSGVLVGMKKTLVFYKGRAPKGEK
SNWVMHEYRLESKQPFNPTNKEEWVVCRVFEKSTAAKKAQEQQPQSSQPSFGSPCDANS .
MANEFEDIDELPNLNSNSSTIDYKGtraiHQYSQRNVYSEDl^ 

WTTSLLGPPLSPINSLLLKAFQIRNSYSFPKEMIPSFNHSSLQQGVSNMIQNASSSSQVQ
PQPQEEAFNMDSIW*
>G1929 (1..366) 

ATGTGTAGAGGCTTSAATAATGAAGAGAGCAGAAGAAGTGACGGAGGAGGTTGCCGGAGT 
CTCTGCACGAGACCGAGTGTTCCGGTAAGGTGTGAGCTTTGCGACGGAGACGCCTCCGTG 
TTCTGTGAAGCGGACTCGGCGTTCCTCTGTAGAAAATGTGACCGGTGGGTTCATGGAGCG 
AATTTTCTAGCTTGGAGACACGTAAGGCGCGTGCTATGCACTTCTTGTCAGAAACTCACG 
CGCCGGTGCCTCGTCGGAGATCATGACTTCCACGTTGTTTTACCGTCGGTGACGACGGTC
GGAGAAACCACCGTGGAGAATAGAAGTGAACAAGATAATCATGAGGTTCCGTTTGTTTTT 

CTCTGA 

>G1929 Amino Acid Sequence (domain in AA coordinates : 31-53) 
MCRGLNNKESRRSDGGGCRSLCTRPSVPWCELCDG 

NFLAWRHVRRVLCTSCQKLTRRCLVGDHDFHVVLPSVTTVGETTVENRSEQDNHEVPFVF
L*

>G1930 (76..1077)

ATTCACATTACTAATCTCTCAAGATTTCACAATTTTCTTGTGATTTTCTCTCAGTTTCTT 
ATTTCGTTTCATAACATGGATGCCATGAGTAGCGTAGACGAGAGCTCTACAACTACAGAT 
TCCATTCCGGCGAGAAAGTCATCGTCTCCGGCGAGTTTACTATATAGAATGGGAAGCGGA 
ACAAGCGTGGTACTTGATTCAGAGAACGGTGTCGAAGTCGAAGTCGAAGCCGAATCAAGA 
AAGCTTCCTTCTTCAAGATTCAAAGGTGTTGTTCCTCAACCAAATGGAAGATGGGGAGCT 
CAGATTTACGAGAAACATCAACGCGTGTGGCTTGGTACTTTCAACGAGGAAGACGAAGCA 
GCTCGTGCTTACGACGTCGCGGCTCACCGTTTCCGTGGCCGCGATGCCGTTACTAATTTC 
AAAGACACGACGTTCGAAGAAGAGGTTGAGTTCTTAAACGCGCATTCGAAATCAGAGATC
GTAGATATGTTGAGAAAACACACTTACAAAGAAGAGTTAGACCAAAGGAAACGTAACCGT 
GACGGTAACGGAAAAGAGACGACGGCGTTTGCTTTGGCTTCGATGGTGGTTATGACGGGG 
TTTAAAACGGCGGAGTTACTGTTTGAGAAAACGGTAACGCCAAGTGACGTCGGGAAACTA 
AACCGTTTAGTTATACCAAAACACCAAGCGGAGAAACATTTTCCGTTACCGTTAGGTAAT
AATAACGTCTCCGTTAAAGGTATGCTGTTGAATTTCGAAGACGTTAACGGGAAAGTGTGG 
AGGTTCCGTTACTCTTATTGGAATAGTAGTCAAAGTTATGTGTTGACCAAAGGTTGGAGT 
AGATTCGTTAAAGAGAAGAGACTTTGTGCTGGTGATTTGATCAGTTTTAAAAGATCCAAC 
GATCAAGATCAAAAATTCTTTATCGGGTGGAAATCGAAATCCGGGTTGGATCTAGAGACG 
GGTCGGGTTATGAGATTGTTTGGGGTTGATATTTCTTTAAACGCCGTCGTTGTAGTGAAG 
GAAACAACGGAGGTGTTAATGTCGTCGTTAAGGTGTAAGAAGCAACGAGTTTTGTAATAA 
CAATTTAACAACTTGGGAAAGAAAAAAAAGCTTTTTGATTTTAATTTCTCTTCAACGTTA 

ATCTTGCTGAGATTA 

>G1930 Amino Acid Sequence (domain in AA coordinates: 59-124) 

MDAMSSVDESSTTTDSIPARKSSSPASLLYRMGSGTSVVLDSENGVEVEV^ 

RFKGVVPQPNGRWGAQIYEKHQRVWLGTFNEEDEA 

EEEVEFLNAHSKSEIVDMLRKHTYKEEL^^

LLFEKTVTPSDVGKLNRLVIPKHQA^

YWNSSQSYVLTKGWSRFVKEKRLCAGDLISFKRSNDQDQKFFIGWKSKSGI^
LFGVDISLNAVVVVKETTEVLMSSLRCKKQRVL*
>G195 (51..1031)
TTTTCTTTTTCTTTCTTTTTC 

AAATCAAAGATCTTAACAACTATCACTACACTTCATCGTATAATCATTACAATATCAACA 
ACCAAAATATGATTAATCTCCCTTACGTTTCTGGTCCATCTGCTTATAATGCAAACATGA 

TCTCATCATCACAAGTAGGTTTTGA^ 

TCGAGTTGGGTTTCGAGCTTTCTCCATCTTCTTCTGACTTTTTTAATCCTTCCCTCGATC 

AAGAGAACGGTTTGTATAATGCTTATAATTATAATAGTAGTCAAAAGAGTCATGAAGTTG 

TCGGTGATGGTTGTGCAACCATTAAGAGTGAAGTTAGGGTTTCAGCATCTCCTTCTTCAA 

GTGAGGCCGATCATCATCCAGGAGAAGATTCCGGCAAGATCCGGAAGAAAAGAGAAGTTC 

GCGATGGAGGAGAAGATGATCAACGCTCTCAGAAAGTAGTTAAAACAAAGAAGAAAGAGG 

AGAAGAAAAAAGAGCCACGAGTCTCGTTCATGACTAAGACCGAAGTTGATCATCTCGAAG 

ACGGCTATCGTTGGAGAAAGTATGGCCAAAAAGCAGTCAAAAACAGTCCTTATCCGAGGA 

GTTACTATAGATGCACGACTCAGAAGTGCAACGTGAAGAAGAGAGTGGAGAGATCTTACC 

AAGACCGAACGGTCGTCATCACAACCTACGAGAGTGAACACAACCATCCGATCCCGACCA 

ATCGTCGGACAGCAATGTTCTCTGGAACCACCGCATCTGATTATAACCCATCATCGTCTC 

CAATATTCTCCGATCTCATCATCAATACTCCAAGAAGCTTCTCAAATGATGATCTCTTCC 

GTGTGCCATACGCTAGTGTGAACGTGAACCCTAGTTATCATCAACAGCAACATGGATTTC 

ATCAAC^GGAGAGTGAGTTCGAGCTCTTGAAGGAGATGTTTCCTTCGGTTTTCTTCAAAC 

AAGAGCCTTGATGA5ATAATATAATATAGAAACAATTTTTTTTCTGCTAAGAAATATAGA 

ACAAAACTTGGATGCATAATAAGTGATGATAGTGTTATTTATTTTTTGCATGTATATATT 

ATACATGTTTTGTTAACTAGCTATAGGATATACTGGTAGTAATTAAGCATAAATATGGAG 

CCCTTCGACTTATTACAATAATTTTTGGT^^ 

NNNTTNNGG 

>G195 Amino Acid Sequence (domain in AA coordinates: 183-239) 
MSHEIKDLNNYHYTSSYlsTHY 

QGAFELGFELSPSSSDFFNPSLDQENGLYNAYNYNSSQKSHEVVGDGCATIKSEVRVSAS 

PSSSEADHHPGEDSGKIRKKREVRDGGEDDQRSQKVVKTKKKEEKKKEPRVSFMTKTEVD

HLEDGYRWRKYGQKAVKNSPYPRSYYRCTTQKCNVKKRVERSYQDRTVVITTYESEHNHP
IPTNRRTAMFSGTTASDYNPSSSPIFSDLIINTPRSFSNDDLFRVPYASVNVNPSYHQQQ
HGFHQQESEFELLKEMFPSVFFKQEP*
>G1954 (196..1440)

ATTTATGACTTCTCAATACAAAAAGCTCCCCTC^ 

CCGTCTTCTTCTACTATCTTGCATGTCTTGCGTCTTTTATATACATCTCTCGTAAACCCT 
AGCAAATCATACAAGGTCAAGAAGCTTGACCTTCATTAGACTTAAGCAGTTTATAATCAA 
CTACGACGAATAGCAATGGATAAAGATTACTCGGGACCAAACTTCTTAGGTGAATCCTCA 
GGCGGTAACGATGATAACAGCTCTGGTATGATAGACTATATGTTCAATAGAAACCTTCAA 
CAACAACAAAAGCAATCGATGCCAGAACAGCAGCAACATC^ 

GGAGCAACACCCTTTGATAAAATGAACTTCTCTGATGTGATGCAGTTTGCGGACTTCGGT 

TCGAAACTTGCGTTGAACCAGACCAGAAACCAAGACGATCAAGAAACCGGGATTGACCCC 

GTTTATTTCTTGAAGTTCCCTGTCTTGAACGACAAAATAGAGGACCATAACCAAACCCAA 

CATCTCATGCCTTCTCATCAGACGTCTCAAGAAGGAGGTGAGTGTGGAGGAAACATAGGC 

AATGTGTTTCTTGAAGAAAAAGAAGATCAAGACGATGACAACGACAACAACTCCGTGCAA 

CTACGTTTTATTGGAGGAGAAGAAGAAGATAGGGAGAACAAGAATGTTACGAAAAAGGAG 

GTGAAGAGCAAGAGGAAGAGAGCTAGAACGAGCAAGACCAGCGAAGAAGTGGAAAGCCAA

CGGATGACTCATATCGCGGTCGAAAGAAACCGTAGGAAGCAAATGAATGAGCATCTTCGT 

GTCCTTAGATCTCTCATGCCTGGCTCCTACGTTCAAAGGGGAGACCAAGCGTCAATCATA

GGAGGAGCAATAGAGTTTGTGAGAGAGCTCGAGCAACTCCTACAATGTCTTGAATCACAG 

AAGCGTCGAAGAATCTTAGGAGAAACCGGTAGGGACATGACAACGACAACGACTTCTTCT 

TCTTCTCCCATAACTACGGTAGCGAACCAAGCACAACCGCTCATTATTACGGGAAATGTA 

ACCGAGCTAGAGGGCGGAGGAGGGCTTCGGGAGGAGACTGCGGAGAACAAGTCGTGCTTG 

GCTGACGTGGAGGTGAAGCTGCTAGGGTTTGACGCCATGATCAAGATACTTTCAAG 

AGGCCGGGACAGCTGATTAAGACTATAGCTGCTTTGGAGGATCTTCATCTCTCTATTCTT 

GACACTAAGATCACTACCATGGAACAAACCGTCCT 

AGTGAAACGAGGTTTACGGC^GAAGACATAGCAAGTTCCATCCAACAGATATTTAGTTTC 
ATTGATGCAAATACGAAC^TATCTGGAAGCTCTAAC^ 

AAATCATCACACGGCGACAACTTTGTACACTGGTGAAGATTACAGTACGTAATAATCTCT 

ACATATTGGGTTTTATTCTCCAAGCATTTGGAAGAGTGTTTAAGTTAAAGGGAGTGCTTA 

CTTTATTTTTTTGGGGCTTTTTTCATGCAATTTAAATTTTAGTGATGATTGTGTCGCTTG 

TAATGTTAGAACTCGTTGTTGTGATTTCTGCTGCTTTGATTTGTAGGTTTTGAACAAGCG 

GTTTAGAATGCTAAACCACTTATTTACTTGA 

AAGAAAAAAA 

>G1954 Amino Acid Sequence (domain in AA coordinates : 187-259) 
MDKDYSAPNFLGESSGGNDDNSSGMIDYMFNRNLQQQQKQSMPQQQQHQLSPSGFGATPF
DKMNFSDVMQFADFGSKLALNQTRNQDDQETGIDPVYFLKFPVLNDKIEDHNQTQHLMPS
HQTSQEGGECGGNIGNVFLEEKEDQDDDNDNNSVQLRFIGGEEEDRENKNVTKKEVKSKR
KRARTSKTSEEVESQRMTHIAVERNRRKQMNEHLRVLRSLMPGSYVQRGDQASIIGGAIE
FVRELEQLLQCLESQKRRRILGETGRDMTTTTTSSSSPITTVAMQAQPLIITGNVTELEG
GGGLREETAENKSCLADVEVKLLGFDAMIKILSRRRPGQLIKTIAALEDLHLSILHTNIT
TMEQTVLYSFNVKITSETRFTAEDIASSIQQIFSFIHANTNISGSSNLGNIVFT* 
>G1958 (107..1336)

GTACCGTCGACCGATTATCCCCAAGAGGAGAATCCTCATAATCATTTTCTCCGATTCGAT 
TCGTCTTCCTTGGTCCTGGATTGCTTCATGAATTTCTAGGACAACAATGGAGGCTCGTCC 
AGTTCATAGATCAGGTTCGAGAGACCTCACACGCACTTCTTCAATCCCATCTACACAAAA 
ACCTTC^CCAGTAGAAGATAGTTTCATGAGATCAGATAACAACAGTCIAGTTAATGTCTAG 
ACCATTAGGACAAACCTACCATTTACTTTCATCTAGTAACGGTGGAGCTGTTGGACATAT 
ATGTTCTTCTTCATeATCTCGTTT^ 

TGAGAAACAACAACACTACACAGGAAGCAGCAGTAATAATGCTGTGCAGACACCAAGCAA
CAACGATAGTGCTTGGTGTCATGATTGATTGCCAGGAGGGOT 

CAACCCGGCGATTCAAAACAACTGTCAGATTGAGGATGGTGGCATTGCGGCTGCTTTTGA 
TGACATTCAAAAACGAAGTGATTGGCATGAATGGGCTGACCATTTGATCACTGATGATGA 
TCCTTTGATGTCTACTAACTGGAATGATCTCTTGCTTGAAACAAATTCCAATTCAGATTC 
AAAGGACCAGAAGACACTGCAAATTCCGCAACCTCAGATTC 

GTCTGTGGAATTGCGACCTGTTAGCACAACATCTTCAAACAGCAATAACGGAACGGGCAA 
GGCACGAATGCGTTGGACGCCAGAGCTTCACGAGGCTTTTGTTGAGGCTGTCAACAGTCT 
TGGCGGTAGTGAAAGAGCTACTCCTAAAGGGGTACTGAAGATTATGAAAGTTGAAGGCTT 
GACTATATATCATGTTAAAAGCCATTTACAGAAATATAGGACAGCTAGATATCGGCCAGA

ACCATCAGAAACTGGTTCGCC^GAAAGGAAGTTGACACCGCTTGAACATATAACATCTCT 

TGATTTGAAAGGTGGGATAGGTATTACAGAGGCTCTACGACTTCAGATGGAAGTACAGAA 

GCAACTCCATGAGCAGCTCGAGATTCAAAGAAACCTGCAACTCCGAATAGAAGAACAAGG 

CAAGTACCTGCAAATGATGTTCGAGAAGCAAAACTCTGGTCTTACCAAAGGGACAGCCTC 

AACATCAGATTCCGCAGCCAAATCTGAACAAGAAGACAAGAAGACTGCTGATTCGAAGGA 

GGTTCCAGAAGAAGAAACCAGGAAATGTGAGGAACTAGAATCTCCACAGCCAAAGCGTCC 

CAAAATCGATAATTGAAAGTATTGGTCTTTTGCTGGATAATCTCGGAGTTTCAGAGTTAA 

CAGTGATAGAGAGAACGAGCTCTTATCTTGAGGTTCTTCAGGACTTCTrCTCGCGGCCG 

CTAG 

>G1958 Amino Acid Sequence (domain in AA coordinates: 230-278) 
MEARPVHRSGSRDLTRTSSIPSTQKPSPVEDSFMRSDNNSQLMSRPLGQTYHLLSSSNGG
AVGHICSSSSSGFATNLHYSTMVSHEKQQHYTGSSSNNAVQTPSNNDSAWCHDSLPGGFL

DFHETNPAIQNNCQIEDGGIAAAFDDIQKRSDWHEWADHLITDDDPLMSTNWNDLLLETN

SNSDSKDQKTLQIPQPQIVQQQPSPSVELRPVSTTSSNSNNGTGKARMRWTPELHEAFVE
AVNSLGGSERATPKGVLKIMKVEGLTIYHVKSHLQKYRTARYRPEPSETGSPERKLTPLE
HITSLDLKGGIGITEALRLQMEVQKQLHEQLEIQRNLQLRIEEQGKYLQMMFEKQNSGLT
KGTASTSDSAAKSEQEDKKTADSKEVPEEETRKCEELESPQPKRPKIDN*

>G196 (111..1421)
TCGACATCAGATTTCTCTC^CGGATTCC^ 

ATTCTTCCCGTGTATAAATCTCATATAAACACGCATCATACATATATATTATGTGCAGCG 
TCTTTGAGTTTCAAGACATGGACAACTTCCAAGGAGATCTAACAGACGTCGTACGAGGAA 
TAGGATCAGGCCACGTGTCACCATCTCCTGGACCACCGGAAGGTCCATCTCCGAGCAGCA 
TGTCTCCGCCGCCAACAT(^GATCTCCACGTGGAATTCCCCTCCGCCGCTACTTCTGCCA 
GCTGTCTCGCAAATCCCTTCGGAGACCCGTTCGTAAGCATGAAGGATCCTCTCATCCACC 
TCCCGGCCAGCTACATCTCCGGCGCCGGTGATAATAAAAGCAACAAAAGTTTTGCAATCT 
TTCGAAAGATTTTTGAGGATGATCATATTAAGAGTCAATGC^GTGTCTTCCCAAGAATTA 
AGATCTCGCAAAGTAAGZUVTATCGACGA 

TCTCCTCTGCCGCCGTAGCAGCTTCGCCGTGGGGCATGATCAACGTTAATACCACTAACA 
GTCCAAGAAACTGTTTACTTGTCGATAATAATAACAACACGTCATCATGCTCACAGGTrC 
AGATCTCTTCTTCCCCTCGGAATCTCGGAATTAAGAGAAGGAAGAGCCAGGCAAAGAAAG 
TGGTGTGCATACCGGCTCCAGCCGCTATGAACAGCCGGTCCAGTGGAGAAGTTGTTCCGT 
CTGATCTATGGGCTTGGCGAAAGTACGGTCAAAAACCTATCAAAGGTTCTCCTTATCCAA 
GGGGTTACTACAGATGTAGCAGCTCAAAAGGTTGTTCAGCTAGGAAACAAGTCGAACGTA 
GCCGC^CTGATCCAAAC^TGTTAGTCATTACTTACACCTCTGAGCmTAACCACCCATGGC 
CTACTCAACGCAACGCTCTCGCAGGTTCCACTCGTTCCTCTTCCTCCTCCTCTTTAAACC 
CTTCTTCCAAATCCTCAACCGCAGCCGCCACTACTTCTCCCTC^TCGAGAGTTTTCCAAA 
ACAACAGCAGCAAAGACGAACCCAATAACTCCAACTTGCCTTCCTCTTCCACTCATCCTC 
CTTTTGACGCCGCCGCAATTAAGGAGGAGAACGTGGAAGAGCGTCAGGAAT^AGATGGAGT 
TCGATTATAATGACGTTGAAAATACCTATAGACCGGAGTTGTTGCAAGAGTTTCAACATC 
AGCCGGAGGATTTCTTTGCCGATCTCGACGAGCTTGAGGGAGATTCTTTGACTATGTTGC 
TCTCTGACAGTAGCGGCGGAGGCAACATGGAAAACAAAACGACGATTCCAGACGTTTTTA 
GTGATTTCTTTGACGACGACGAGTCCTCAAGGTCGTTATAAATATTGTTGTTAATGTATA 
CATAGAAATGAAATTATTCATGTAATTCGTTTTGTGTTAAATGACGGTATTTGCCTTTGC 
A 

>G196 Amino Acid Sequence (conserved domain in AA coordinates : 223-283) 

MCSVFEFQDMDNFQGDLTDVVRGIGSGHVSPSPGPPEGPSPSSMSPPPTSDLHVEFPSAA

TSASCLANPFGDPFVSMKDPLIHLPASYISGAGDNKSNKSFAIFPKIFEDDHIKSQCSVF

PRIKISQSNNIHDASTCNSPAITVSSAAVAASPWGMINVNTTNSPRNCLLVDNNNNTSSC

SQVQISSSPRNLGIKRRKSQAKKVVCIPAPAAMNSRSSGEVVPSDLWAWRKYGQKPIKGS

PYPRGYYRCSSSKGCSARKQVERSRTDPNMLVITYTSEHNHPWPTQRNALAGSTRSSSSS

SLNPSSKSSTAAATTSPSSRVFQNNSSKDEPNNSNLPSSSTHPPFDAAAIKEENVEERQE

KMEFDYNDVENTYRPELLQEFQHQPEDFFADLDELEGDSLTMLLSDSSGGGNMENKTTIP

DVFSDFFDDDESSRSL* 

>G1965 (1..609) 

ATGGATAACTTCAATGTTGTTGCCAATGAAGACAATCAAGTGAATGATGTGAAGCCTCCA 
CCACCCCCACCGCGAGTGTGTGCAAGATGTGATTCTGATAACACAAAATTTTGTTACTAC 
AACAATTATAGTGAGTTTCAACCGCGCTACTTCTGCAAGAACTGTCGAAGATACTGGACT
CATGGTGGGGCTTTAAGAAACGTACCAATTGGTGGGAGTAGTCGTGCCAAGCGGACAAGG 
ATAAATCAACCTTCAGTTGCTCAGATGGTTTCTGTTGGAATCCAACCAGGGAACCGTTTT 
AGTTCTTTGTCTCATATTCATGGTGGTATGGTAACAAATGTGCATCCAACTCAAACTTT^ 
CGACCAAATCATCGCCTAGCTTTCCATAATGGATCATTTGAGCAAGATTATTATGATGTT 
GGGTCTGATAATCTTTTGGTAAACCAACAAGTTGGTGGATATGTTGATAATCACAACGGT 
TATCACATGAATCAAGTGGATC^TACAACTGGAACCAGAGCTTCAATAACGCTATGAAC 
ATGAATTATAATAACGCTAGCACTAGCGGAAGGATGCATCCTAGTCATTTAGAGAAGGGT 

GGTCCTTGA 

>G1965 Amino Acid Sequence (domain in AA coordinates : 27-55) 
MDNFNWANEDNQVNDVl^ 

HGGALRNVPIGGSSRAKRTRINQPSVAQMVSVGIQPGNR^ 

RPNHRLAFHNGSFEQDYYDVGSDNLLVNQQVGGYVDNHNGYHMNQVDQYNWNQSFNNAMN

MNYNNASTSGRMHPSHLEKGGP* 

>G1976 (1..1152) 

ATGACTGATCCTTATTCCAATTTCTTCACAGACTGGTTCAAGTCTAATCCTTTTCACCAT 
TACCCTAATTCCTCCACTAACCCCTCTCCTCATCCTCTTCCTCCTGTTACTCCTCCCTCT 
TCCTTCTTCTTCTTCCCTC^UVTC 

CCTCCTTCTCCTCCTCTCTOAGAAGCCCTCCCTOTCCTCAGCCTCAGCCCCGCC^C^AA 
CAACAAGACCACCATCAGAACGATGACC 

GATGTCGACTACGATC^TCACCATCAAGATGATCATCATAACCTCGATGACGATGACCAT 

GACGTCACCGTTGCTCTTCACATAGGCCTTCCAAGCCCTAGTGCTCAAGAGATGGCCTCT 

TTGOTCATGATGTCTTCTTCTTCCTCTTCCTCGAGGAC(^CTCATCATCACGAGGACATG 

AATCACAAGAAAGACCTCGACCATGAGTACAGCCACGGAGCTGTCGGAGGAGGAGAAGAT 

GACGATGAAGATTCAGTCGGCGGAGACGGCGGCTGTAGAATCAGCAGACTCAACAAGGGT 

CAATATTGGATCCCTACACCTTCTCAGATTCTCATTGGCCCTACTCAGTTCTCATGTCCT 

GTTTGCTTCAAAACCTTCAACAGATACAATAACATGCAGATGCATATGTGGGG 

TCACAATACAGAAAAGGACCTGAATCTCTAAGGGGAACACAACCAACAGGAATGCTAAGG 

CTTCCGTGCTATTGCTGCGCCCCAGGCTGTCGCAACAACATTGACCATCCAAGGGCA 

CCTCTCAAAGACTTCAGAACCCTTCAAACACATTACAAGAGAAAAC7VTG 

TTCATGTGTAGGAAATGTGGAAAGGCTTTCGCAGTCCGAGGGGACTGGAGAACACATGAG 

AAGAATTGTGGCAAACTTTGGTATTGCATATGTGGATCTGATTTCAAGCACAAGAGATCT 

CTCAAAGATCACATCAAGGCTTTTGGGAATGGTCATGGAGCCTACGGAATTGATGGGTTT 

GATGAAGAAGATGAGCCTGCCTCTGAGGTAGAACAATTAGACAATGATCATGAGTCAATG 

CAGTCTAAATAG 

>G1976 Amino Acid Sequence (domain in AA coordinates: 219-323) 
MTDPYSNFFTDWFKSNPFHHYPNSSTNPSPHPLPPVTPPSSFFFFPQSGDLRRPPPPPTP
PPSPPLREALPLLSLSPANKQQDHHHNHDHLIQEPPSTSMDVDYDHHHQDDHHNLDDDDH
DVTVALHIGLPSPSAQEMASLLMMSSSSSSSRTTHHHEDMNHKKDLDHEYSHGAVGGGED
DDEDSVGGDGGCRISRLNKGQYWIPTPSQILIGPTQFSCPVCFKTFNRYlsnS^ 
SQYRKGPESLRGTQPTGMLRLPCYCCAPGCRN^

FMCRKCGKAFAVRGDWRTHEKNCGKLWYCICGSDFKHKRSLKDHIKAFGNGHGAYGIDGF
DEEDEPASEVEQLDNDHESMQSK*
>G2057 (27..1289)

GCCGTCTCGACGAATATGCTCTACCAATGTCTGACGACCAATTCCATCACCCGCCGCCTC 
CTTCTTCAATGAGGCACCGTTCTACGTCGGATGCGGCGGACGGCGGCTGCGGCGAGATTG 
TTGAGGTGCAAGGTGGTGAGATTGTTCGGTCTACCGGAAGAAAAGACCGCCACAGCAAAG 
TCTGCACGGCTAAAGGGCCACGTGACCGGCGCGTGAGACTCTCTGCTCACACGGCGATTC 
AGTTTTACGATGTTCAAGACAGGCTTGGTTTCGACCGACCTAGCAAAGCCGTTGATTGGC 
TTATCAAAAAGGCTAAGACTTCCATTGACGAGCTCGCTGAGCTTCCTCCCTGGAATCCCG 
CCGATGCAATTCGCCTAGCCGCTGCTAACGCTAAACCCAGAAGAACCACCGCCAAAACCC 
AAATCTCTCCGTCTCCGCC^CCGCCGCAACAGC^ 

GTGTTGGCTTCAACGGAGGAGGAGCAGAGCATCCGAGTAACAACGAGTCGAGTTTTCTCC 

CGCCGTCAATGGATTCAGATTCGATAGCTGACACTATAAAGTCGTTTTTTCCGGTGA 

GCTCTTCAACGGAGGCTCCTTCGAATCATAACCTTATGCACAACTATCATCATCAGCATC 

CGCCGGATTTGCTTTCTCGAACTAATAGCCAAAACCAAGATCTCCGTCTCTCGCTGCAAT 

CGTTCCCGGATGGTCCACCGTCGCTTCTGCACCACCAACATCACCACCACACCTCTGCTT 
CCGCCTCCGAGCCTACTCTGTTCTACGGACAGAGCAATCCGTTAGGGTTTGACACATCGA
GTTGGGAGCAGCAGTCGTCGGAATTCGGAAGGATTCAGAGACTAGTGGCTTGGAACAGCG 
GCGGTGGCGGCGGAGCAACCGATACAGGAAACGGAGGAGGGTTTCTGTTCGCTCCTCCTA 
CTCCTTCAACGACGTCGTTTC^GCCAGTTOTTGGCCAAAGCCTU^CAGCTTTATTCTCAGA 
GGGGTCCCCTTCAGTCCAGTTACAGTCCCATGATCCGTGCTTGGTTTGATCCTCACCATC 
ATCACCAATCC^TCTCCACCGACGATCTC^CCACCACCATCACCTTCCTCCACCGGTTC 
ACCAATCAGCAATCCCCGGAATCGGATTCGCCTCAG 

TACCAGCACGGTTTCAGGGCCAAGAAGAGGAGCAGCACGACGGTCTCACTCACAAGCCGT 

CCTCTGCTTCCTCTATTTCTCGCCATTGACAATCGAAACTAATCCTC 

>G2057 Amino Acid Sequence (domain in AA coordinates: TBD) 

MSDDQFHHPPPPSSMRHRSTSDAAIXBGCGEIVEVQGGHira^ 

RRVRLSAHTAIQFYDVQDRLGFDRPSKAVDWLIKKAKTSIDELAELPPWNPADAIRLAAA
NAKPRRTTAKTQISPSPPPPQQQQQQQQLQFGVGFNGGGAEHPSNNESSFLPPSMDSDSI 
ADTlKSFFPVIGSSTEAPSNHNLMHirafflQHPPDIiliSRTO 

LHHQHHHHTSASASEPTLFYGQSNPLGFDTSSWEQQSSEFGRIQRLVAWNSGGGGGATDT

GNGGGFLFAPPTPSTTSFQPVLGQSQQLYSQRGPLQSSYSPMIRAWFDPHHHHQSISTDD
LNHHHHLPPPVHQSAIPGIGFASGEFSSGFRIPARFQGQEEEQHDGLTHKPSSASSISRH

* 

>G2107 (79..624)
ACC^CAAAAC^GAGCAACACACAAC^^ 

TTGAGAACCAGATCGGAGATGGAAAACGACGATATCACCGTGGCGGAGATGAAGCCAAAG 
AAGCGTGCTGGACGGAGGATTTTCAAGGAGACACGTCACCCAATCTACAGAGGCGTGCGG 
CGTAGGGACGGCGACAAATGGGTATGCGAAGTCCGTGAACCGATTCATCAGCGTCGAGTC 
TGGCTCGGAACTTATCCGACGGCAGATATGGCCGCACGTGCTCACGACGTGGCGGTTCTT 
GCTCTGCGCGGGAGATCCGCGTGTTTGAATTTCTCCGATTCTGCTTGGAGGTTGCCGGTG 
CCGGCATCCACTGATCCGGACACGATCAGGCGCACGGCGGCCGAAGCAGCGGAGATGTTC 
AGGCCGCCGGAGTTTAGTACAGGAATTACGGTTTTACCCTCAGCCAGTGAGTTTGACACG 
TCGGATGAAGGAGTCGCTGGAATGATGATGAGGCTCGCGGAGGAGCCGTTGATGTCGCCG 
CCAAGATCGTACATTGATATGAATACGAGTGTGTACGTGGACGAAGAAATGTGTTACGAA 
GATTTGTCACTTTGGAGTTACTAAAATACGTATGTGTTAAAAAACCAAAGATCGTATGTG 
TATGTATGCATAATAAATGGGCTTAATGATGGGCATAGATATGATAGGTCCAGCCTATAT 
GTTAAATGTGTTTTATTTTTTGGTTTATCTAGTTTCCTAGGTATTTACCAAATTGTATTA 
GTATAAGTTTTATTAAGAAATAATCAAAAATGTTGTTGCCAAAAAAAAAAAAAAAAAAA 

AAAAA 

>G2107 Amino Acid Sequence (domain in AA coordinates: TBD) 
MENDDITVAEMKPKKRAGRRIFKETRHPIYRGVRRRDGDKWVCEVREPIHQRRVWLGTYP

TADMAARAHDVAVLALRGRSACLNFSDSAWRLPVPASTDPDTIRRTAAEAAEMFRPPEFS

TGITVLPSASEFDTSDEGVAGMMMRLAEEPLMSPPRSYIDMNTSVYVDEEMCYEDLSLWS

Y* 

>G211 (1..750) 

ATGATGTCATGTGGTGGGAAGAAGCCAGTGTCTAAGAAAACAACGCCGTGTTGCACGAAG 
ATGGGGATGAAGAGAGGACCATGGACGGTGGAGGAAGACGAGATTCTTGTGAGCTTCATT 
AAGAAAGAAGGTGAAGGACGGTGGCGATCGCTTCCTAAGAGAGCTGGTTTACTCAGATGT 
GGAAAGAGCTGTCGTCTACGGTGGATGAACTATCTCCGACCCTCGGTTAAACGTGGAGGA 
ATTACGTCGGACGAGGAAGATCTCATCCTCCGTCTTCACCGCCTCCTCGGCAACAGGTGG 
TCATTGATCGCGGGAAGGATACCGGGAAGGACTGATAATGAAATTAAGAACTATTGGAAC 
ACTCATCTTCGTAAGAAACTTTTAAGGCAAGGAATTGATCCTCAAACCCACAAGCCTCTT 
GATGCAAACAACATCCATAAACCAGAAGAAGAAGTTTCCGGTGGACAAAAGTACCCTCTA
GAGCCTATTTCTAGTTCTCATACTGATGATACCACTGTTAATGGCGGGGATGGAGATAGC 
AAGAACAGTATCAATGTCTTTGGTGGTGAACACGGCTACGAAGACTTTGGTTTCTGCTAC
GACGACAAGTTCTCATCGTTTCTTAATTCGCTCATCAACGATGTTGGTGATCCTTTTGGT 
AATATTATCCCAATATCTCAACCTTTGCAGATGGATGATTGTAAGGATGGGATTGTTGGA 
GCGTCGTCTTCTAGCTTAGGACATGACTAG 

>G211 Amino Acid Sequence (conserved domain in AA coordinates : 24-137) 
MMSCGGKKPVSKKTTPCCTKMGMKRGPWTVEEDEILVSFIKKEGEGRWRSLPKRAGLLRC
GKSCRLRWMNYLRPSVKRGGITSDEEDLILRLHRLLGNRWSLIAGRIPGRTDNEIKNYWN
THLRKKLLRQGIDPQTHKPLDANN^
KNSINVFGGEHGYEDFGFCYDDKFSSFLNSLINDVGDPFGNIIPISQPLQMDDCKDGIVG

ASSSSLGHD* 
>G2133 (26..457)

ATCTCATCTTCATCCACCCAAAAACATGGATTCAAGAGACACCGGAGAAACTGACCAGAG 
CAAGTACAAAGGTATCCGTCGTCGGAAATGGGGAAAATGGGTATCAGAGATTCGTGTCCC 
GGGAACTCGTCAACGTCTCTGGTTAGGCTCTTTCTCCACCGCAGAAGGCGCTGCCGTAGC 
CCACGACGTCGCTTTTTACTGCTTGCACCGACCATCTTCCCTCGACGACGAATCTTTTAA 
CTTCCCTCACTTACTTACAACCTCCCTCGCCTCCAATATATCTCCTAAGTCCATCCAAAA 
AGCTGCTTCCGACGCCGGCATGGCCGTGGACGCCGGATTCCATGGTGCTGTGTCTGGGAG 
TGGTGGTTGTGAAGAGAGATCTTCCATGGCGAATATGGAGGAGGAGGACAAACTTAGTAT 
CTCCGTGTATGATTATCTTGAAGACGATCTCGTTTGATCTATACGAGTACGTTTTTAGCA

GTTAA 

>G2133 Amino Acid Sequence (domain in AA coordinates : 11-83) 
MDSRDTGETDQSKYKGIRRRKWGKWVSEIRVPGTRQRLWLGSFSTAEGAAVAHDVAFYCL

HRPSSLDDESFNFPHLLTTSLASNISPKSIQKAASDAGMAVDAGFHGAVSGSGGCEERSS

MANMEEEDKLSISVYDYLEDDLV* 

>G2134 (36..644)

GAGCAAAAACTTTGTGTGCGTGTGTGTGTGTGTTCATGGCTGGTCTTAGGAATTCCGGTA 
AC^GCGACAAAGCGCAAAACGATGGCAAAGGTGTACCATCTGCCTACAGAGGAGTCCGGA 
AGAGAAAATGGGGGAAATGGGTGTCTGAAATCCGTGAACCGGGGACCAAGAACGGTATCT 
GGCTAGGCAGTTTCGAGACTCCTGAAATGGCTGCAACCGCATACGACGTGGCAGCATTTC 
ATTTCAGAGGGAGAGAAGCTCGTCTCAACTTCCCTGAGCTCGCCAGCAGCCTTCCACGTC 
CTGCAGACTCTAGCTCAGACAGCATTCGCATGGCAGTTCATGAGGCAACACTCTGCCGCA 
CC^CCGAAGGAACAGAGTCAGCC^TGCAAGTGGAC^GCTCAAGCTCCTCCAATGTAGCTC 
CAACAATGGTCAGAC^CTCGCCCAGGGAAATTC^ 

CTCCTACTACAATGATGCATTCAACATACGACCCTATGGAGTTTGCTAATGATGTGGAGA 

TGAATGCTTGGGAAACATACCAGAGTGACTTTCTTTGGGACCCTTAACCCCAAAACCTAA 

CTCATGGAGAGCTTCTACAGCTCAATCTTACAATACCAGCATAAGTTACTGGCTTAGA^ 

ACTTAAATTTATTGAAGTTTAGTTTTCAGAGTCTACCACAAGGGTTGTTG 

TATAGCAAAGAATAAAGCTCATCAGATTTTGGAGGGAAAGACTCTATGAGCTTGATGGGT 

CCCTGAAAGGACCTCTTCACAAATATTTTTAAATTTTTTTGTTACTAGTAGAAACATAGA 

TTATGAGGTGTGACTTATTATTATTTTTTACAATTGTTTGTTACCTCATTGATGTATTTG 

ATTT 

>G2134 Amino Acid Sequence (domain in AA coordinates: TBD) 
MAGLRNSGNSDKAQNDGKGVPSAYRGVRKRKWGKWVSEIREPGTKNRIWLGSFETPEMAA
TAYDVAAFHFRGREARLNFPELASSLPRPADSSSDSIRMAVHEATLCRTTEGTESAMQVD
SSSSSNVAPTIWRLSPREIQAINESTLGSPTTMM^
WDP*PQNLTHGELLQLNLTIPA*
>G2151 (236..1321)

TTTTTTTTTTAGGGTTCATAAGAACAAATTGGATTTTGAGCTCACAGTATAAATAACCCG 
ACTTTGATTACTGGGTAATTTTAAAACCGCCATTGTTGTTCTCTTTACTACTTTTGGGAA 
TTAGGGTTTATGATTTCTGGGTATTAGATTAGATAAATTTGTTTCCTITTTTTTGTTAATC 
AATTTAAAAATCTCTTATTTCTGTTAAAGACTTGTAATTTTGGAGTTTTTAATGCATGGA 
CGGAAGAGAAGCAATGGCATTTCCAGGCTCGCATTCTCAGTACTATCTTCAAAGAGGAGC 
CTTTACTAATCTCGCACCTTCCCAAGT^ 

GGGATTGAGGCCAATGTCTAACCCTAACATTCATCACCCTCAGGCTAACAATCCAGGACC 

TCCTTTCTCGGATTTTGGACACACCATTCACATGGGAGTGGTCTCCTCTGCTTCTGATGC 

TGATGTGCAACCGCCACCGCCACCGCCACCACCAGAGGAACCGATGGTTAAGAGGAAACG 

TGGACGGCCAAGAAAGTATGGAGAACCGATGGTTAGTAATAAGTCTAGGGACTCTTCTCC 

AATGTCTGATCCTAATGAACCTAAACGGGCCAGAGGTCGACCTCCTGGAACTGGAAGGAA 

GCAACGCTTGGCTAATCTTGGTGAGTGGATGAATACTTCAGCTGGA^^ 

TCATGTGATCAGCATTGGAGCAGGAGAAGACATTGCTGCGAAAGTTTTGTCATTTTCACA 

ACAAAGACCTCGGGCTCTTTGTATAATGTCAGGCACTGGAACCATTTCTTCAGTCACTCT 

GTGCAAACCCGGTTCAACCGATCGTCACTTAACATACGAGGGACCTTTTGAGATTATAAG

TTTTGGTGGATCTTATTTGGTGAATGAAGAAGGTGGATCCAGAAGTCGAACAGGCGGATT 

GAGTGTCTCTCTTTCTCGTCCCGATGGTAGTATTATTGCCGGTGGAGTTGACATGCTTAT 

CGCAGCC^CCTTGTTCAGGTGGTGGCATGTAGTTTTGTATACGGAGCAAGGGCAAAGAC 
TCATAATAAGAATAACAAGAC(^TCAGACAAGAAAAGGAACC^

TAGTGAAATGGAGACCACACCGGGTAGTGCAGCTGAACCAGCAGCATCTGCGGGTCAGCA 

GACGCCACAGAACTTCTCTTCTCAGGGAATAAGGGGGTGGCCCGGTTCAGGCTCAGGCTC 

TGGCAGATCACTTGACATTTGCAGAAACCCACTCACTGATTTTGATTTGACTCGTGGATG 

ATATACACTATTAGTCTTTGAAGCAGCAGCATAC^UU^TGTGATTGCTGTACATATGTTA 

TTGTAGATTTCTCTCTGGGAATGTTGAAATCAGACATTTAAGGATTGATACTAGATCTCT 

CAGCTCCTTCTAACATTGTTAATGTAACAGAACCCTCCCACTTTCATGCTATTTGC 

>G2151 Amino Acid Sequence (domain in AA coordinates : 93-113 , 124-144) 

MDGRKAMAFPGSHSQYYIiQRGAFTNIxAPSQVASGLHAPPPHTGIjRPM 

GPPFSDFGHTIHMGVVSSASDADVQPPPPPPPPEEPMVKRKRGRPRKYGEPMVSNKSRDS

SPMSDPNEPKRARGRPPGTGRKQRLANLGEWMNTSAGLAFAPHVISIGAGEDIAAKVLSF

SQQRPRALCIMSGTGTISSVTLCKPGSTDRHLTYEGPFEIISFGGSYLVNEEGGSRSRTG

GLSVSLSRPDGSIIAGGVDMLIAANLVQVVACSFVYGARAKTHNNNNKTIRQEKEPNEED
NNSEMETTPGSAAEPAASAGQQTPQNFSSQGIRGWPGSGSGSGRSLDICRNPLTDFDLTR

G* 

>G2154 (82..1317)

GCAAAAAGAAAAAATGAAAAAAAATCCCTAACTCTCTCTCTCTAGAAATTCTTATTTTTG 
TGCGTATCTCTCTAAAAAGGAATGGATCCTAACG 
CAGCTCCATCACCTCCACCWICAGCAACAGCAACAGCAGCA 
CCTTACTTCCACCACCAACTACAGCAC^ 

GCTTCTACCGGAAACGCCGTTCCTVTCTTCC^CAATGGGCTTTTCCCTCCGCAGCCTCAG 
CCACAGCACCAGCCTAATGATGGGTCATCTTCTCTCGCGGTGTACCCTCATTCAGTTCCG 
TCCTCGGCTGTGACGGCGCCGATGGAGCCGGTAAAGAGGAAGAGGGGTCGACCAAGAAAG 
TATGTGACGCCGGAACAAGCCCTAGCGGCTAAGAAATTGGCGTCTTCTGCGAGTAGTTCG 
TCTGCTAAACAGAGGCGAGAGCTTGCTGCTGTTACCGGTGGTACGGTATCGACTAATTCC 
GGGTCATCCAAGAAATCTCAGCTTGGTTCTGTCGGGAAAACTGGACAATGTTTTACTCCG 
C^TATTGTTAATATAGCTCCTGGCGAGGATGTGGTCCAGAAAATTATGATGTTCGCAAAC 
CAZUVGCAAGCATGAACTATGCGTTCTTTCT^ 

CGCCAACCGGCTCCATCAGGAGGCAACTTACCATATGAGGGTCAATACGAGATTCTCTCIA 
CTATCTGGATCCTATATCCGAACTGAAC^GGTGGTAAATCCGGCGGCCTTAGCGTTTCT 
TTATCTGCTTCAGATGGTCAGATCATCGGTGGAGCGATTGGTAGCCATCTCACAGCTGCT 
GGCCCGGTTCAGGTGATTCTTGGTACGTTTCAGCTTGATAGAAAGAAGGATGCCGCCGGG 
AGTGGTGGGAAAGGGGATGCTTCAAACAGTGGAAGTCGGTTAACTTCTCCTGTAAGCTCT
GGACAGTTGCTTGGCATGGGTTTCCCTCCTGGTATGGAATCTACGGGAAGAAATCCAATG 
AGGGGAAACGACGAGCAACATGATCATCATCATCATCAAGCCGGTTTGGGTGGACCTCAT 
CATTTCATGATGCAAGCGCCGCAGGGGATACACATGACACATTCCAGGCCATCTGAATGG 
CGCGGAGGAGGCAACAGCGGTCATGATGGCAGAGGCGGTGGCGGGTATGATTTGTCAGGA 
AGGATAGGACATGAGTCGTCGGAGAATGGAGATTACGAGCAGCAAATACCGGATTAGCAG 
AGCTTCCAGGAGAAGTGTGTAGAGTTTAGATCCCAAGTAGAGAAACAGAAGGCGAGCAAA 
GAATCTGAACTGAGAGAGGACTTATTAGACAGAGACTCGTCTGAAGGGTCTTTAATCATA 
GAAAGAAGTTGCTGAGTGATTGCTTTTGTTCTTCTTCTTGGTACGGTGTATTATATTAAC 
TCCACAACCTTTTTTTTATACTO 

TTTTTTTATACTCTTTTTCTTTTCTTATAATATTTTTTTTGGTTTTTCTTTCGTTTGTTA 

CTAAAAAAGGAAATGCTCTTTTTGTGAAATATATACACTTCGTTTG 

>G2154 Amino Acid Sequence (domain in AA coordinates : 97-119) 

itopneshhhhqqqqlhhlhqqqqqqqqqqrltspy 

PSSNWGLFPPQPQPQHQPNDGSSSLAVYPHSVPSSAVTAPMEPVKRKRGRPRKYVTPEQA
LAAKKLASSASSSSAKQRRELAAVTGGTVSTNSGSSKKSQLGSVGKTGQCFTPHIVNIAP
GEDVVQKIMMFANQSKHELCVLSASGTISNASLRQPAPSGGNLPYEGQYEILSLSGSYIR
TEQGGKSGGLSVSLSASDGQIIGGAIGSHLTAAGPVQVILGTFQLDRKKDAAGSGGKGDA

SNSGSRLTSPVSSGQLLGMGFPPGMESTGRNPMRGNDEQHDHHHHQAGLGGPHHFMMQAP
QGIHMTHSRPSEWRGGGNSGHDGRGGGGYDLSGRIGHESSENGDYEQQIPD*

>G2157 (306..1238)
TCTTTTGATTTTAACCTTTTTTCAGTA^ 

CCTTTTATGATAAAGGTATGATGATAGCAAACAAATGATACCCCCATGTCTTGTGTGTCT 
GCTTCATGCAACATGTTGGTTTGGATTTGGTTAATCTAAAAGTTTAAGATAAGGTTTTCG 
GATTCTCTTCCTGTCTTGTAATAGTTTCTTGTCGGAGAGCCATCAACACCAACTTCAACA 
AAAAAAACAAGAAAAAGAAAAAGATTCTCTTTCTCGTTTTATTTCCATTAGAGAAGAAAA

AAAGAATGGCGAATCCTTGGTGGGTAGGGAATGTTGCGATCGGTGGAGTTGAGAGTCCAG 

TGACGTCATCAGCTCCTTCTTTGCACCACAGAAACAGTAACAACAACAACCC^ 

TGACTCGTTCGGATCCAAGATTGGACCATGACTTCACCACCAACAACAGTGGAAGCCCTA 

ATACCCAGACTCAGAGCCAAGAAGAACAGAACAGCAGAGACGAGCAACCAGCTGTTGAAC 

CCGGATCCGGATCCGGGTCTACGGGTCGTCGTCCTAGAGGTAGACCTCCTGGTTCCAAGA 

ACAAACGAAAGAGTCCAGTTGTTGTTACCAAAGAAAG 

TTCTTGAGATTGCTACGGGAGCTGACGTGGCGGAAAGCTTAAACGCCTTTGCTCGTAGAC 
GCGGCCGGGGCGTTTCGGTGCTGAGCGGTAGTGGTTTGGTTACTAATGTTACTCTGCGTC 
AGCCTGCTGCATCCGGTGGAGTTGTTAGTTTACGTGGTCAGTTTGAGATCTTGTCTATGT 
GTGGGGCTTTTCTTCCTACGTCTGGCTCTCCTGCTGCAGCCGCTGGTTTAACCATTTACT 
TAGCTGGAGCTCAAGGTCAAGTTGTGGGAGGTGGAGTTGCTGGCCCGCTTATTGCCTCTG 
GACCCGTTATTGTGATAGCTGCTACGTTTTGCAATGCCACTTATGAGAGGTTACCGATTG 
AGGAAGAACAAGAGCAAGAGCAGCCGCTTCAACTAGAAG^ 

AGAATGATGATAACGAGAGTGGGAATAACGGAAACGAAGGATCGATGCAGCCGCCGATGT 

ATAATATGCCTCCTAATTTTATCCCAAATGGTC^TCAAATGGCTC^^CACGACGTGTATT 

GGGGTGGTCCTCCGCCTCGTGCTCCTCCTTCGTATTGATTAGTTAGATAG6CGGTGGTTG 

GTGCGTTCTTTTTAC^GGAATGATTATATTTT 

TAAAGCTATCAAGTTTCTITTTTTTTT^ 

GTTTGTTTGTTTTGTGGCX3GCTTTTCTGACTTGACTATTTTGATCGCGGATAGCTTTGTA 

TGAAAGTGAATTGATTGTAGAATCGTCTTTTGAATTTTGATGTTGGAAAAAACCAA 

>G2157 Amino Acid Sequence (domain in AA coordinates: 82-102, 164-107) 

MANPWWVGNVAIGGVESPVTSSAPSLHHRNSNNNNPPTMTRSDPRLDHDFTTNNSGSPNT

QTQSQEEQNSRDEQPAVEPGSGSGSTGRRPRGRPPGSKNKPKSPVVVTKESPNSLQSHVL

EIATGADVAESLNAFARRRGRGVSVLSGSGLVTNVTLRQPAASGGVVSLRGQFEILSMCG

AFLPTSGSPAAAAGLTIYLAGAQGQVVGGGVAGPLIASGPVIVIAATFCNATYERLPIEE

EQQQEQPLQLEDGKKQKEENDDNESGNNGNEGSMQPPMYNMPPNFIPNGHQMAQHDVYWG 

GPPPRAPPSY* 

>G2181 (1..1005) 

ATGATGCTTGCGGTGGAAGATGTGTTAAGCGAACTCGCCGGAGAAGAAAGGAACGAGAGA 

GGATTGCCACCTGGCTTCCGGTTTCACCCGACGGACGAAGAGCTCATTACCTTCTACTTA 

GCTTCCAAAATCTTCCATGGTGGTCTCTCCGGCATTCACATTTCCGAAGTTGATCTCAAC 

CGCTGTGAACCTTGGGAGCTACCAGAAATGGCGAAGATGGGAGAGAGAGAGTGGTACTTT 

TATAGTCTAAGGGACAGGAAATATCCGACAGGTTTGAGGACTAACAGAGCAACTACTGCT 

GGATACTGGAAAGCTACCGGCAAAGATAAGGAAGTCTTCTCCGGCGGAGGAGGACAGCTT 

GTTGGGATGAAGAAGACGTTGGTGTTCTACAAAGGTAGGGCTCCACGTGGCCTCAAGACT 

AAGTGGGTCATGCATGAGTATCGCCTCGAAAACGACCATTCACACCGCCACACGTGTAAG 

GAGGAATGGGTGATTTGGAGAGTGTTCAATAAAACAGGA^ 

ATCCATAACCAAATCAGCTACCTTCATAAC 

CATGAAGCCTTACCTTTGCTTATAGAACCT^ 

CTACTCTACGATGATCCACACGWU^CTACAATA - 

GGCCACAACATCGACGAGCTCAAAGCCTTAATCAACCCTGTCGTCTCTCAGCTCAACGGT

ATCATCTTTCCTTCAGGGAACAACAACAACGACGAAGACGACTTCGACTTTAACCTCGGC 

GTGAAAACAGAGCAGTCTTCGAACGGTAACGAAATTGACGTACGAGATTACTTGGAGAAC

CCTCTGTTTCAGGAAGCGAGTTATGGTCTGTTGGGTTTTTCGTCTTCTCCTGGACCTCTT 

CACATGCTACTAGATTCTCCATGTCCTTTAGGATTCCAGCTGTAG 

>G2181 Amino Acid Sequence (conserved domain in AA coordinates: 22-169)
MMLAVEDVLSELAGEERNERGLPPGFRFHPTDEELITFYLASKIFHGGLSGIHISEVDLN
RCEPWELPEMAKMGEREWYFYSLRDRKYPTGLRTNRATTAGYWKATGKDKEVFSGGGGQL
VGMKKTLVFYKGRAPRGLKTKWVMHEYRLE]^ 
IHNQISYLHNHSLSTTHHHHHEALPLLIEPSNKTL^ 

GHNIDELKALINPVVSQLNGIIFPSGNNNNDEDDFDFNLGVKTEQSSNGNEIDVRDYLEN
PLFQEASYGLLGFSSSPGPLHMLLDSPCPLGFQL*
>G221 (115..795)

CTCTCTTATTCTCTCACTCTTTTTTTTTTATATTCCTCTCTCTCTAAATCTATAAAATAT 
ATTTAAAAACTTGATCGTATATAATAAAGTAAATAAAGAATAATAACAAAAAAAATGGAG 
AAAAGAGGAGGAGGAAGTAGTGGAGGTTCGGGATCATCAGCAGAAGCAGAAGTGAGAAAA
GGACCATGGACGATGGAAGAAGATCTTATTCTTATCAACTATATCGCCAACCACGGCGAT
GGTGTTTGGAATTCTCTCGCCAAATCTGCAGGTCTAAAACGAACCGGGAAAAGTTGCCGG 
CTCCGGTGGCTGAACTATCTCCGCCCCGACGTACGACGGGGAAACATCACTCCAGAAGAG 
CAACTTATCATCATGGAACTTCATGCTAAGTGGGGAAACAGGTGGTCGAAAATCGCCAAA
CATCTTCC^GGAAGAACGGACP^CGAGATCAAA 

TACATCAAGCAATCGGATGTAACAACAACATCGTCCGTTGGATCTCATCATAGCTCAGAG 
ATCAACGATCAAGCTGCAAGCACGTCGAGCCATAATGTCTTTTGTACACAAGATCAAGCG 
ATGGAGACTTATTCTCCTACACCGACATCATATCAACATACCAATATGGAATTCAACTAT 
GGTAACTATTCGGCCGCGGCAGTGACGGCAACCGTGGATTATCCAGTACCGATGACCGTT 
GATGATCAAACCGGTGAAAACTATTGGGGCATGGATGATATTTGGTCATCAATGCATTTA 
TTGAATGGTAATTGATTGATCGGTGGACAAAACATGGAATATTAATTGAGTATTATATAT 
GATTTTTAGGAGTACTATTATTAGTACGTGACATGTATATGTTTTTGCCTCGTTGTAGAG 
GTTTGGGGTTATAATTAATATATAATGTTATCTAATATGCAACCTTGATACATATTTGGA 
TCTTTATTGAACCCATGTTATACATAAATAAAATTGTTGAAGGGGTCATAAAAAAAAAAA 
AAAAAAAAAAAAA 

>G221 Amino Acid Sequence (domain in AA coordinates: 21-125) 

MEKRGGGSSGGSGSSAEAEVRKGPWTMEEDLILINYIANHGDGVWNSLAKSAGLKRTGKS

CRLRWLNYLRPDVRRGNITPEEQLIIMELHAKWGNRW

QKYIKQSDVTTTSSVGSHHSSEINDQAASTSSHNVFCTQDQAMETYSPTPTSYQHTNMEF

NYGNYSAAAVTATVDYPVPMTVDDQTGENYWGMDDIWSSMHLLNGN*

>G2290 (119..982)

TTCn^TTCT TTCTT T CTT TCTCTTCCA^ 

TCTCTACCTCTCTTTCTCTATCTTCTCTTATCAgTACTTCTCTCGCC 

GAACGATCCTGATAATCCCGATCTGAGCAACGACGACTCTGCTTGGAGAGAACTCACACT 
CACmGCTCAAGATTCTGACTTCTTCGACCGAGACACTTCCAATATCCTCTCTGACTTCGG 
TTGGAACCTCCACGACTCCTCCGATCATCCTC^ 

ACAAACCACCGGAGTCAAACCTACCACCGTCACTTCTTCTTGTTCCTCATCCGCCGCCGT 
TTCCGTTGCCGTTACCTCTACTAATAATAATCCCTCAGCTACCTCAAGTTCAAGTGAAGA 
TCCGGCCGAGAACTCAACCGCCTCCGCCGAGAAAACACCACCACCGGAGACACCAGTGAA 
GGAGAAGAAGAAGGCTCAAAAGCGAATTCGGCAACCAAGATTCGCATTCATGACCAAGAG 
TGATGTGGATAATCITGAAGATGGATATCGATGGCGTTU^TATGGAC^AAAAGCCGTC^^ 
GAATAGCCGATTCCCAAGGAGCTACTATAGATGCACAAACAGCAGATGCACGGTGAAGAA 
GAGAGTAGAACGTTCATCAGATGATCCATCGATAGTGATCACAACATACGAAGGACAACA 
TTGCCATCAAACC^TTGGATTCCCTCGTGGTGGAATCCTC^CTGCAC^CGACCC^CATAG 
CTTCACTTCTCATC^TCATCTCCCTCCT 

CCTTCATCAACTTCAC^GAGACAATAATGCTCCTTCACCGCGGTTACCCCGACCTACTAC 

TGAAGATACACCTGCCGTGTCTACTCCATCAGAGGAAGGCTTACTTGGTGATATTGTACC 

TCAAACTATGCGCAACCCTTGAGGTAAGCTTGGTACGTAGCAATAGCTAAGGAGdTGCTA 

ACTCATTATATATAGAAGATATTGG&GACCAG^ 

GGCGTTGTAACAATGGATCTATATATTACCTGATTG'^ 

CGTTTGCAATTTCTTCATGTATATTTCTTGTTATATATGTAGTTATATATCCAGGTATAA 
TTTTGATGTAACACAACATTAATCTTAATCGTGGATCCATCCCACATTTGATGCATGTAT 
GTGCACTTAAGAAAAAGAACATGGAGGAAATAACGTTATTTTTTATTATTCT 

>G2290 Amino Acid Sequence (conserved domain in AA coordinates: 147-205)

MNDPDNPDLSMDDSAWREIiTIjTAQDSDFFDI^TSNILSDFGWNIiHHSSDHPH 

TQTTGVlCPTTvTSSCSSSAAVSvAv^ 

KEKKKAQKRIRQPRFAFMTKSDVDNLEDGYRWRKYGQKAVia^ 

KRVERSSDDPS I VPf TYEGQHCHQTIGFPRGGILTAHDPHSFTSHHHLPPPLPNPYYYQE 

LLHQLHRDIMAPSPRIjPRPTTEDTPAVSTPSEEGLLGDIVPQTMRNP* 

>G2299 (231..941)

GCCAAAATTTTACCAACATTTTTCTCTTCT 
CACACTTCACTGCCCTGTTTTTTTTCCT^ 

TCCCCCTGAAGCCTAGCTATTTCTTTTTATTTGCATTAATCTCGGGATCCGAATCGAAAA 
AAGCAATCAGAATAATAGACTTGTACGATACTTGTGCCTAAGCTAACACAATGGCAGAGG 
AATACTACAGCCTCCGCTCGGAGAGAGTAACTCAGCTTCTTGTCCCTAACTCGGAGTCTG 
ACTCAGTGAGTGACAAAAGCAAAGCTGAGCAAAGCGAGAAGAAGACTAAACGTGGGAGAG 
ACTCCGGTAAACACCCTGTTTATCGCGGAGTAAGGATGAGGAACTGGGGAAAATGGGTGT
CGGAGATTCGTGAGCCGAGGAAGAAATCACGTATTTGGCTGGGAACTTTCCCGACGCCGG

AGATGGCGGCGCGTGCACACGACGTGGCGGCTCTGAGCATTAAAGGAACGGCCGCTATAC 

TAAACTTCCCTGAACTCGCTGACTCATTCCCTCGACCCGTTTCATTAAGCCCTCGAGACA 

TTCAGACAGCAGCTCTTAAAGCAGCTCACATGGAACCGACGACGTCGTTTTCATCTTCCA 

CGTCTTCGTCGTCGTCTTTGTCTTCTACGTCTTCGCTCGAGTCTCTTGTGTTGGTGATGG 

ACCTCTCGAGGACTGAGTCGGAGGAGCTCGGTGAGATTGTGGAGCTTCCAAGTCTCGGGG 

CGAGTTACGACGTCGACTCGGCTAACCTTGGGAACGAGTTTGTCTTCTATGACTCAGTTG 

ACTACTGTTTATATCCGCCGCCGTGGGGACAGTCGTCCGAAGATAACTATGGTCACGGAA 

TTAGCCCTAATTTTGGCCATGGCT^GTCATGGGATCT 

ACCATAATGTTTTGTTTAAAACAGTTTATTTTGTA 

CACGTTTTTAAAACCCTTTGCTGTa"rTTGTTTaU"l , rTTTGAG d l ,v rri'T 

>G2299 Amino Acid Sequence (conserved domain in AA coordinates: 48-115)
MAEEYYSLRSERVTQLLVPNSESDSVSDKSKAEQSEKKTKRGRDSGKHPVYRGVRMRNWG

KWVSEIREPRKKSRIWLGTFPTPEMAARAHDVAALSIKGTAAILNFPELADSFPRPVSLS
PRDIQTAALKAAHMEPTTSFSSSTSSSSSLSSTSSLESLVLVMDLSRTESEELGEIVELP
SLGASYDVDSANLGNEFVFYDSVDYCLYPPPWGQSSEDNYGHGISPNFGHGLSWDL*
>G2340 (274..1275)

ATACAAAACTCCCTCTTCTCTATCTTCTTCATCTTAAAGAAAAAATAAGAGATATTCGTA 
AAGAGAGAACACAAAATTTCAGTTTACGAAAAGCTAGCAAAGTCGAGTATCGAGGAATAA 
CAGAATAAGACGTATCTATCCTTGCCTTAATGTTCTTACCAAAAGATCTAGTCCTTTCTT 
TGTATGATCGATCCATCAGAAGCCCACAACAAC^ 

AGCTTCTATTTTTAATACATTCAAGAATCi^AGAATGGTACGGACGCCGTGTTGTAGAGCA 

GAAGGGTTGAAGAAAGGAGCATGGACTCAAGAAGAAGACCAAAAGCTTATCGCCTATGTT 

CAACGACATGGTGAAGGCGGTTGGCGAACCCTTCCGGACAAAGCTGGACTC^ 

GGCAAAAGCTGCAGATTGAGATGGGCGAATTACTTAAGACCTGACATTAAACGTGGAGAG 

TTTAGCCAAGACGAGGAAGATTCCATCATCAACCTCCACGCCATTCATGGCAACAAATGG 

TCGGCCATAGCTCGTAAAATACO^GAAGAACAGACAATGAGATCAAGAACCATTGG 

ACTCACATCAAGAAATGTCTGGTCAAGAAAGGTATTGA 

CTCGATGGAGCCGGTAAATCATCTGACCATC^ 

GACGACAAAGATGATCAGAATTCAAATAAGAAAA^ 

TTTTTGAACAGAGTAGCAAACAGATTCGGTCATAGAATCAACCACAAT 

ATTATTGGAAGTAATGGCCTACTTACTAGTCACACTACTCCAACTACAAGTGTTTCAGAA 

GGTGAGAGGTCAACGAGTTCTTCCTCCACACAT^

AGCATAACCGTTGATGCAACATCTCTATCCTCATCCACGTTCTCTGACTCCCCCGACCCG 
TGTTTATACGAGGAAATAGTCGGTGAC^TTGAAGATATGACGAGATTTTCATCAAGATGT 
TTGAGTCATGTTTTATCTCATGAAGATTTATTGATGTCCGTTGAGTCTTGTTTGGAGAAT 
ACTTCATTCATGAGGGAAATTACAATGATCTTTCAAGAGGATAAAATCGAGACGACGTCG 
TTTAATGATAGCTACGTGACGCCGATCAATGAAGTTGATGACTCCTGTGAAGGGATTGAC 
AATTATTTTGGATGAGTTATATTGATGATGATGA 

TTAGAGTTTGATTTGCTATGGTGTTTTTAGTTTGTGTGTGTAGTGTGTTTCGACCGTCAA 
AAAAAAAAAAAAAAAAAAAAAAAAAA 

>G2340 Amino Acid Sequence (domain in AA coordinates: 14-120)
MVRTPCCRAEGLKKGAWTQEEDQKLIAW^ 

LRPDIKRGEFSQDEEDSIINLHAIHGNKWSAIARKIPRRTDNEIKNHWNTHIKKCLVKKG
IDPLTHKSLIjDGAGKSSDHSAHPEKSSvHDDKDDQNSI^KK^ 

RINHNVLSDIIGSNGLLTSHTTPTTSVSEGERSTSSSSTHTSSNLPINRSITVDATSLSS
STFSDSPDPCLYEEIVGPIEDMTRFSSRCLSHVLSHEDLLMSVESCLENTSFMREITMIF

QEDKIETTSFNDSYVTPINEVDDSCEGIDNYFG*

>G2346 (1..1011) 

ATGGAGTTGTTAATGTGTTCGGGTCAGGCCGAGTCAGGTGGTTCTTCTTCCACCGAGTCT 
TCTTCACTCAGTGGTGGACTCAGGTTTGGTCAGAAGATCTACTTCGAGGATGGATCCGGA 
TCCAGAAGCAAGAACCGGGTCAATACCGTTCGTAAGTCGTCTACCACGGCGAGGTGCCAA 
GTGGAAGGTTGTAGAATGGATCTAAGCAATGTTAAAGCTTATTACTCGAGACACAAAGTT 
TGTTGCATTGACTCTAAATCATCTAAAGTGAT^ 

CAACAATGTAGCAGGTTTCACCAGCTTTCTGAGTTTGACTTGGAGAAAAGAAGTTGTCGC 
AGAAGACTCGCTTGTCATAACGAACGACGAAGAAAACCACAACCCACAACGGCTCTTTTC
ACTTCTCATTACTCTCGAATCGCTCC^TCTCTTTACGGAAACCCCAATGCTGCAATGATT
AAAAGCGTTTTGGGAGATCCTACTGCGTGGTCAACCGCAAGATCAGTGATGCAGCGGCCT
GGACCGTGGCAGATTAATCCAGTTAGGGAAACCCATCCACACATGAATGTTTTATCACAT 
GGAAGCTCAAGCTTTACTACATGTCCAGAGATGATAAACAACAATAGCACAGATTCAAGC
TGTGCTCTCTCTCTTCTGTCAAACTCAT^ 

ACAAATACATGGCGACCATCTTCTGGTTTCGACTCGATGATCTCATTCTCCGATAAGGTT 

ACAATGGCTCAGCCACCGCCCATTTCAACCCATCAGCCGCCCATCTCAACACATCAGCAG 

TACCTCAGCCAAACTTGGGAAGTCATCGCGGGCGAAAAGAGCAATTCACATTATATGTCT

CCTGTGAGTCAAATCTCGGAGCCAGCAGATTTCCAGATAAGCAATGGCAGTGTGTCGCCC 

TATTCTCCTCCGTCCTTACTATCTCTTGTGTGCTACTTGCGGCCGCTATAG 

>G2346 Amino Acid Sequence (domain in AA coordinates: 59-135) 

MELLMCSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSRSKNRVNTVRKSSTTARCQ

VEGCRMDLSNVKAYYSRHK^ 

RRLACHNERRRKPQPTTALFTSHYSRIAPSLYGNPNAAMIKSVLGDPTAWSTARSVMQRP
GPWQINPVRETHPHMNVLSHGSSSFTTCPEMINNN^ 

TNTWRPSSGFDSMISFSDKVTMAQPPPISTHQPPISTHQQYLSQTWEVIAGEKSNSHYMS

PVSQISEPADFQISNGSVSPYSPPSLLSLVCYLRPL*

>G237 (1..852) 

ATGGCGAAGACGAAATATGGAGAGAGACATAGGAAAGGGTTATGGTCACCTGAAGAAGAC

GAGAAGCTAAGGAGCTTCATCCTCTCTTATGGCCATTCTTGCTGGACCACTGTTCCC^ 

AAAGCTGGGTTACAAAGGAATGGGAAGAGCTGCAGATTAAGATGGATTAATTACCTAAGA 

CCAGGGTTAAAGAGGGATATGATTAGTGCAGAAGAAGAAGAGACTATCTTGACGTTTCAT 

TCTCCCTTGGGTAACAAGTGGTCGCAAATAGCTAAATTCTTACCGGGAAGAACAGACAAT 

GAGATAAAGAACTATTGGCACTCTCATTTGAAAAAGAAATGGCTCAAGTCTCAGAGCTTA 

CAAGATGCAAAATCTATTTCCCCTCCTTCGTCTTCATCATCATCACTTGTTGCTTGTGGA 

GAAAGAAATCCGGAAACCTTGATCTCGAATCACGTGTTCTCCCTCCAGAGACTTCTAGAG 

AACAAATCTT(^TCTCCCTCACAAGAAAGCAACGGAAATAACAGCCATC^ 

GCTCCTGAGATTCC^GGCTTTTCTTCTCTGAATGGCTTTCTTCTTCA 

GATTATTCCTCTGAGTTTACCGACTCTAAGCACAGTCAAGCTCCAAATGTCGAAGAGACT 

CTCTCAGCTTATGAAGAAATGGGTGATGTTGATCAGT^ 

AACAACAGCAACTGGACTCTTAACGACATTGTGTT^ 

CATGATATTTATAGAGAGGCTTCAGATTGTAATTCTTCTGCTGAATTCTTTTCTCCACCA 
ACAACGACGTAAATTGCGTTTATTGTAATGTAAATCAAATTTCTAAGGCAAAACCGGAAA 
AAAAAAAAAAAAAAAAAAAA 

>G237 Amino Acid Sequence (domain in AA coordinates: 11-113) 
MAKTKYGERHRKGLWSPEEDEKLRSFILSYGHSCWTTVPIKAGLQRNGKSCRLRWINYLR
PGLKRDMISAEEEETILTFHSPLGNKWSQIAKFLPGRTDNEIKNYWHSHLKKKWLKSQSL

QDAKSISPPSSSSSSLVACGERNPETLISNHVFSLQRLLENKSSSPSQESNGNNSHQCSS
APEIPRLFFSEWLSSSYPHTDYSSEFTDSKHSQAPNVEETLSAYEEMGDVDQFHYMEMMI
KNSNWTLNDIVFGSKCKKQEHHIYREASDCNSSAEFFSPPTTT* 
>G2373 (48..1199)

GCAAAATCCTCAGATCGTCTTACCTTCTCCGAATCGATCGATTTTTCATGGAGGACGACG 
ACGAGATTCAGTCAATTCCATCTCCGGGAGATTCTTCCCTTTCACCACAAGCTCCTCCTT 
CTCCGCCGATTTTGCCAACAAACGACGTGACGGTGGCCGTCGTGAAGAAACCACAACCGG 
GGCTTTCTTCTCAATCTCCGTCCATGAACGCTTTAGCGTTAGTGGTTCATACTCCTTCTG 
TAACCGGTGGTGGTGGTAGCGGAAACAGAAACGGACGAGGAGGAGGAGGAGGAAGCGGTG 
GTGGTGGAGGAGGAAGAGATGATTGTTGGAGCGAAGAAGCTACAAAGGTTCTAATCGAAG 
CTTGGGGAGATCGATTCTCTGAACCAGGTAAAGGAACTTTGAAGCAACAACATTGGAAAG
AAGTAGCTGAGATTGTGAACAAGAGTCGTCAATGCAAATACCCTAAAACTGATATTCAGT 
GTAAGAACAGAATTGATACGGTGAAGAAGAAGTATAAGCAAGAGAAAGCTAAGATTGCTT 
CTGGTGATGGACCTAGTAAATGGGTTTTCTTCAAGAAGCTTGAGAGTTTGATTGGTGGTA 
CTACAACATTCATTGCTTCTTCAAAAGCTTCAGAGAAGGCTCCTATGGGAGGAGCTCTTG 
GGAATAGCCGTTCGAGTATGTTTAAACGGCAAACTAAAGGTAATCAGATTGTGCAGCAAC 
AACAAGAGAAGAGAGGCTCTGATTCGATGCGGTGGCATTTTAGGAAACGTAGTGCTTCTG 
AGACTGAGTCTGAGTCTGATCCTGAACCTGAGGCTTCTCCTGAGGAATCTGCTGAGAGTC 
TCCCACCTTTGCAACCGATTCAACCGCTTTCGTTTCATATGCCAAAGCGGTTGAAGGTGG 
ATAAGAGTGGAGGTGGAGGGAGTGGAGTTGGAGATGTGGCGAGGGCGATACTTGGATTTA 
CGGAAGCTTATGAGAAGGCGGAAACTGCTAAGCTTAAGTTAATGGCGGAACTGGAAAAGG
AGAGGATGAAATTTGCTAAAGAGATGGAGTTGCAGAGAATGCAGTTCTTGAAAACTCAAT
TGGAGATAACACAGAACAATCAAGAAGAGGAAGAGAGGAGCAGGCAGCGAGGAGAAAGGA
GGATCGTTGATGATGATGATGATCGCAATGGCAAGAATAACGGCAATGTAAGTAGCTGAC
AATTGAACACACAAATGTTCCTATGATATTTGCTATGATAAGCTGGATTTTAGGTTTTGA 
TGG 

>G2373 Amino Acid Sequence (domain in AA coordinates: 290-350)
MEDDDEIQSIPSPGDSSLSPQAPPSPPILPTNDVTVAVVKKPQPGLSSQSPSMNALALVV
HTPSVTGGGGSGNRNGRGGGGGSGGGGGGRDDCWSEEATKVLIEAWGDRFSEPGKGTLKQ
QHWKEVAEIVNKSRQCKYPKTDIQCKNRIDTVKKKYKQEKAKIASGDGPSKWVFFKKLES

LIGGTTTFIASSKASEKAPMGGALGNSRSSMFKRQTKGNQIVQQQQEKRGSDSMRWHFRK

RSASETESESDPEPEASPEESAESLPPLQPIQPLSFHMPKRLKVDKSGGGGSGVGDVARA

ILGFTEAYEKAETAKLKLMAELEKERMKFAKEMELQRMQFLKTQLEITQNNQEEEERSRQ

RGERRIVDDDDDRNGKNNGNVSS*

>G2376 (39..1370)

CACGAGCTTCTGACTCAGATCCGGCGATATCGAATTCCATGGAGGACGATGAAGACATCC 
GATCTCAGGGTTCCGATTCACCTGATCCGTCTTCCTCCCCGCCGGCGGGACGAATCACGG 
TTACGGTGGCTTCGGCAGGTCCGCCTTCTTATTCTCTGACTCCTCCGGGTAATTCGTCGC 
AGAAGGATCCGGATGCGTTGGCTCTGGCGCTGCTTCCGATTCAGGCCAGCGGTGGAGGGA 
ATAACAGCAGTGGGAGACCAACCGGCGGCGGCGGGAGGGAGGATTGTTGGAGCGAAGCAG 
CTACGGCTGTGTTGATTGATGCGTGGGGTGAGAGATACTTGGAGCTTAGCAGAGGGAATC 
TGAAGCAGAAGCACTGGAAAGAGGTGGCTGAGATTGTGAGCAGCAGAGAGGATTACGGTA 
AAATTCCCAAAACTGATATACAGTGTAAGAATAGGATCGATACGGTGAAGAAGAAGTATA 
AACAAGAGAAGGTGAGAATCGCTAACGGCGGTGGCCGTAGCAGATGGGTGTTCTTCGACA 
AGCTTGACCGTCTGATTGGATCAACGGCGAAGATCCCGACGGCAACTTCTGGAGTCAGCG 
GTCCTGTCGGAGGATTGCATAAGATTCCTATGGGTATTCCAATGGGAAGTCGTTCGAATC 
TGTACCATCAGCAAGCTAAGGCTGCAACACCGCCTTTCAATAATCTTGACCGGTTAATTG
GAGCTACGGCTAGAGTCTCAGCTGCTTCTTTCGGTGGCAGTGGTGGAGGAGGCGGAGGAG 
GATCTGTCAATGTACCTATGGGAATTCCGATGAGTAGCCGTTCAGCTCCGTTTGGACAGC 
AAGGGAGGACTCTGCC^CAGCAAGGTAGGAC^CTGCC^ 

TGGTGAAAAGGTGTAGTGAGTCGAAACGCTGGCGTTTCAGGAAGAGGAACGCTTCTGATT 
CAGACTCGGAATCTGAAGCAGCAATGTCAGATGATTCCGGTGACAGTTTACCACCTCCTC 
CTCTGTCGAAGAGGATGAAGACGGAGGAGAAGAAGAAGCAAGATGGTGATGGAGTGGGGA 
ACAAATGGAGGGAGCTGACTCGGGCAATCATGAGATTCGGTGAAGCTTATGAGCAAACAG 
AGAATGCGAAACTGCAACAGGTGGTTGAGATGGAGAAAGAGAGGATGAAGTTCTTGAAGG 
AGCTTGAGTTGCAGAGAATGCAGTTCTTTGTGAAGACTCAATTGGAGATATCACAACTTA 
AGCAGCAACATGGGAGGAGAATGGGAAACACCAGTAATGATCATCATCACAGCCGCAAGA 
ACAACATCAATGCGATTGTCAACAACAACAACGATTTGGGTAATAACTAGAATTTAGTGA 
TGCAGTGTCGTAATTGATATATTTTAGATTTGAG 

>G2376 Amino Acid Sequence (domain in AA coordinates: 79-178, 336-408)

MEDDEDIRSQGSDSPDPSSSPPAGRITVTVASAGPPSYSLTPPGNSSQKDPDALALALLP

IQASGGGNNSSGRPTGGGGREDCWSEAATAVLIDAWGERYLELSRGNLKQKHWKEVAEIV

SSREDYGKIPKTDIQCKNRIDTVKKKYKQEKVRIANGGGRSRWVFFDKLDRLIGSTAKIP

TATSGVSGPVGGLHKIPMGIPMGSRSNLYHQQAKAATPPFNNLDRLIGATARVSAASFGG

SGGGGGGGSVNVPMGIPMSSRSAPFGQQGRTLPQQGRTLPQQQQQGMMVKRCSESKRWRF

RKRNASDSDSESEAAMSDDSGDSLPPPPLSKRMKTEEKKKQDGDGVGNKWRELTRAIMRF 

GEAYEQTENAKLQQVVEMEKERMKFLKELELQRMQFFVKTQLEISQLKQQHGRRMGNTSN 

DHHHSRKNNINAIVNNNNDLGNN*

>G24 (194..724)

CGGACGCGTGGGCAAATATTAAAATAAAAAGTGTCGGTGAATTCTCAATCTTTGTCTTCT 
TTCGTCGTCTCTTTAAAACTCCTCCGTCCCTCCTTATTATGTAACCGTCTCGCCGTCAAA 
TTTTCAAAATCTCTCCCTCCGTTCATAAACCCAGATCGAAATTTATGGTTTTGTAATTTT 
TTTACCGGCGGTTATGGAGACGGAAGCGGCGGTGACAGCGACGGTTACGGCGGCGACGAT 
GGGGATTGGGACGAGGAAGAGAGATCTGAAACCGTATAAAGGAATACGAATGAGGAAATG 
GGGGAAATGGGTGGCGGAGATACGGGAACCGAATAAGAGATCAAGGATCTGGTTAGGTTC 
TTATGCGACGCCTGAAGCGGCGGCGAGAGCTTACGACACTGCTGTTTTTTACCTCCGTGG 
TCCTTCAGCGAGGCTTAATTTTCCGGAGCTTTTGGCTGGACTTACTGTTTCTAACGGCGG 
AGGAAGAGGTGGTGATTTATCGGCGGCGTATATTAGGAGAAAAGCGGCGGAGGTTGGTGC
TCAGGTTGATGCGCTTGGAGCGACGGTGGTTGTGAATACCGGCGGCGAGAATCGCGGTGA
TTACGAGAAGATTGAGAATTGTCGTAAGAGCGGTAACGGGTCATTGGAACGGGTCGATTT 
GAATAAATTACCCGACCCGGAAAATTCGGATGGTGATGATGACGAATGTGTGAAAAGAAG 
ATAGAAAAAATAAAAAGTAGTTGTAGAAGGAGAGACGAGAATGTTTGTCTTTAAGATGCG 
CTGTTGCCGCTAACATGCGCTTTCGATTTTAGTGTTAAAC^TGCGCCTCCATTGTTTTTG 
GGTTTTGTTTTCGTCGTCGATAATCAAAGATTTTAAAACAC^ 

TGTTACAAACTAGATTTGCATGATCTTTGTATTAACGAATAACGATTAAGTCCTAAA 
>G24 Amino Acid Sequence (domain in AA coordinates: 25-93) 
METEAAVTATVTAATMGIGTRKRDLKPYKGIRMRKWGKWVAEIREPNKRSRIWLGSYATP
EAAARAYDTAVFYLRGPSARLNFPELLAGLTVSNGGGRGGDLSAAYIRRKAAEVGAQVDA
LGATVVVNTGGENRGDYEKIENCRKSGNGSLERVDLNKLPDPENSDGDDDECVKRR*
>G2424 (1..999)

ATGAGGATGGAGATGGTGCATGCTGACGTGGCGTCTCTCTCGATAACACCTTGCTTCCCG 
TCTTCTTTGTCTTCGTCCTCAGATC^^ 

GAAGATGAACACCATTCGATGGATCAGACCACTTCATCGGACTACTTCTCTTTAAA 

GACAATGCTCAACATCTCCGTAGCTACTACACAAGTCATAGAGAAGAAGACATGAACCCT 

AATCTAAGTGATTACAGTAATTGCAACAAGAAAGACACAACAGTCTATAGAAGCTGT 

CACTCGTCAAAAGCTTCGGTGTCTAGAGGACATTGGAGACCAGCTGAAGATACTAAGCTC 

AAAGAACTAGTCGCCGTCTACGGTCCACAAAACTGGAACCTCATAGCTGAGAAGCTCCAA 

GGAAGATCCGGGAAAAGCTGTAGGCTTCGATGGTTTAACCAACTAGACCCAAGGATAAAT 

AGAAGAGCCTTCACTGAGGAAGAAGAAGAGAGGCTAATGCAAGCTCATAGGCTTTATGGT 

AACAAATGGGCGATGATAGCGAGGCTTTTCCCTGGTAGGACTGATAATTCTGTGAAGAAC 

CATTGGCATGTTATAATGGCTCGCAAGTTTAGGGAACAATCTTCTTCTTACCGTAGGAGG 

AAGACGATGGTTTCTCTTAAGCCACTGATTAACCCT^ 

GACCCTACCCGGTTAGCTTTGACCCACCTTGCTAGTAGTGACCATAAGCAGCTTATGTTA 
CCAGTTCCTTGCTTCCCAGGTTATGATCATGAAAATGAGAGTCCATTAATGGTGGATATG 
TTCGAAACCCAAATGATGGTTGGCGATTAGATTGC^ 

GATTTCTTAAACCAAACCGGGAAGAGTGAGATATTTGAAAGAATCAATGAGGAGAAGAAA 
CCACCATTTTTCGATTTTCTTGGGTTGGGGACGGTGTGA 

>G2424 Amino Acid Sequence (conserved domain in AA coordinates: 107-219)

MRMEMVHADVASLSITPCFPSSLSSSSHHHYNQQQHCIMSEDQHHSMDQTTSSDY 

DNAQHLRSYYTSHREEDMNPNLSDYSNCNKKDTTVYRSCGHSSKASVSRGHWRPAEDTKL

KELVAVYGPQNWNLIAEKLQGRSGKSCRLRWFNQLDPRINRRAFTEEEEERLMQAHRLYG

NKWAMIARLFPGRTDNSVKNHWHVI 

DPTRLALTHLASSDHKQLMLPVPCFPGYDHENESPLMVDMFETQMMVGDYIAWTQEATTF

DFLNQTGKSEIFERINEEKKPPFFDFLGLGTV* 

>G2505 (1..1026)

ATGGGTTCTTCGTCGAACGGAGGAGTGCCACCTGGTTTCCGGTTTCATCCGACGGACGAA 
GAGCTTCTCCATTACTACTTGAAGAAGAAAATCTCTTACCAAAAGTTTGAGATGGAAGTC
ATCAGAGAGGTTGACTTAAACAAGCTTGAGCCTTGGGATTTGCAAGAGAGATGCAAGATA 
GGATCAACACCACAAAACGAATGGTACTTCTTCAGCCACAAGGAC^ 

GGGTCAAGGACCAACCGTGCTACTCATGCAGGGTTCTGGAAGGCGACGGGACGTGACAAG 

TGCATAAGGAACTCTTACAAAAAGATAGGAATGAGGAAGAGACTTGTGTTCTACAAAGGT

AGAGCTCCTCATGGCCAAAAGACTGATTGGATCATGCATGAGTACCGTCTTGAAGACGCT 

GATGATCCTCAAGCCAACCCTAGTGAAGATGGATGGGTGGTATGTAGAGTGTTTATGAAG 

AAAAATTTGTTCAAGGTAGTAAATGAAGGTAGCTCAAGCATTAACTCATTGGACCAACAC 

AACCATGACGCATCTAACAACAACCATGC^CTTCAAGCTCGTAGCTTTATGCAC 

AGTCCATACCAGCTAGTACGTAACCACGGAGCCATGACATTCGAACTTAACAAGCCTGAC 

CTTGCTCTTCATCAATACCCACCAATCTTCCACAAGCCACCTTCAC 

TCTTCAGGACTTGCAAGGGACAGTGAGAGTGCGGCTAGTGAAGGGTTACAATACCAGCAA 
GCGTGTGAGCCGGGTTTAGACGTTGGTACATGTGAGACAGTGGCTAGTCATAATCATCAA 
CAAGGTCTAGGTGAATGGGCAATGATGGATAGACTTGTGACTTGTCACATGGGAAATGAA 
GATTCCTCTAGAGGGATTACGTATGAGGATGGTAACAACAATTCGTCCTCTGTGGTTCAG 
CCAGTTCCCGCGACGAACCAGCTAACATTGCGTAGTGAGATGGATTTCTGGGGTTATTCT 
AAATAG 

>G2505 Amino Acid Sequence (domain in AA coordinates: 10-159) 
MGSSSNGGVPPGFRFHPTDEELLHYYLKKKISYQKFEMEVIREVDLNKLEPWDLQERCKI
GSTPQNEWYFFSHKDRKYPTGSRTNRATHAGFWKATGRDK^

RAPHGQKTDWIMHEYRLEDADDPQAN^ 

NHDASNNNHALQARSFMHRDSPYQL^ 

SSGLARDSESAASEGIjQYQQACEPGIjDVGTCETVASHNHQQG^ 
DSSRGITYEDGMNNSSSWQPVPATNQIiTLRSEMDFWGYSK* 
>G2512 (64..798)

AACTTAGTGCCACTTAGAC^C^TAAGAAAACCGTTAACAAGAAGAAAAAAAAAAGATCG 
AAAATGGAATATCAAACTAACTTCTTAAGTGGAGAGTTTTCCCCGGAGAACTCTTCTTCA 
AGCTCATGGAGCTCACAAGAATCATTCTTGTGGGAAGAGAGTT^ 
GACCAATCCOTCCTTTTATCTAGCCCTACTC 

GAATCATCAATCATAAAAGAAGAAGGAAAAGAAGCCACCGTGGCGGCCGAGGAGGAGGAG 
AAGTCATACAGAGGAGTGAGGAAACGGCCGTGGGGGAAATTCGCGGCCGAGATAAGAGAC 
TCAACGAGGAAAGGGATAAGAGTGTGGCTTGGGACATTCGACACCGCGGAGGCGGCGGCT 
CTCGCTTATGATCAGGCGGCTTTCGCTTTGAAAGGCAGCCTCGCAGTACTCAATTTCCCC 
GCGGATGTCGTTGAAGAATCTCTCCGGAAGATGGAGAATGTGAATCTCAATGATGGAGAG 
TCTCCGGTGATAGCCTTGAAGAGAAAACACTCCATGAGAAACCGTCCTAGAGGAAAGAAG 
AAATCTTCTTCTTCTTCGACGTTGACATCTTCTCCTTCTTCCTCCTCCTCCTATTCATCT 
TCTTCGTCTTCTTCTTCTTTGTCGTCAAGAAGTAGAAAACAGAGTG 

GAAAGTAATACAACACTTGTGGTTCTTGAGGATTTAGGTGCTGAATACTTAGAAGAGCTT 
ATGAGATCATGTTCTTGATAATCTCTGCTTCTAGAATTTTTATGTAATTTGA 

>G2512 Amino Acid Sequence (conserved domain in AA coordinates: 79-139)
MEYQTNFLSGEFSPENSSSSSWSSQESFLWEESFLHQSFDQSFLLSSPTDNYCDDFFAFE
SSIIKEEGKEATVAAEEEEKSYRGVRKRPWGKFAAEIRDSTRKGIRVWLGTFDTAEAAAL
AYDQAAFALKGSLAVLNFPADVVEESLRKMENVNLNDGESPVIALKRKHSMRNRPRGKKK

SSSSSTLTSSPSSSSSYSSSSSSSSLSSRSRKQSVWTQESNTTLVVLEDLGAEYLEELM
RSCS*

>G2513 (69..698)

TTTO^CAGTAATTTAAGTTAACCGGAGTCTC

TTTGAGTTATGAATAATGATGATATTATTCTGGCGGAGATGAGGCCTAAGAAGCGTGCGG
GAAGGAGAGTGTTTAAGGAGAC^CGTCACCCAGTT^^

GTGACAAATGGGTCTGCGAAGTC^GAGAACCGACGCACCAACGCCGCATTTGGCTCGGGA

CTTATCCCACAG(^GATATGGCAGCGCGTGC^CACGACGTGGCGGTTTTAGCTCTGCGTG 
GGAGATCCGCATGTTTGAATTTCGCCGACTCCGCTTGGCGGCTTCCGGTGCCGGAATCCA 
ATGATCCGGATGTGATAAGAAGAGTTGCGGCGGAAGCTGCGGAGATGTTTAGGCCGGTGG 
ATTTAGAAAGTGGAATTACGGTTTTGCCTTGTGCGGGAGATGATGTGGATTTGGGTTTTG 
GTTCGGGTTCCGGCTCTGGTTCGGGATCGGAGGAGAGGAATTCTTCTTCGTATGGATTTG 
GAGACTACGAAGAAGTCTCAACGACGATGATGAGACTCGCGGAGGGGCCACTAATGTCGC 
CGCCGCGATCGTATATGGAAGACATGACTCCTACTAATGTTTACACGGAAGAAGAGATGT 
GTTATGAAGATATGTCATTGTGGAGTTACAGATATTAAGTGGGACTCACATATCTACTAT 
ACATAATATTTAGCTTTTATGTAAGAGGTATTTATGTGAGTTTTAAGATTGTAGATGTGT 
CCCAGGCGTTAGAAGTTTCCTTGATGGTATGGAATCTTTGTACCTATAAAATTATAAAAT 
T 

>G2513 Amino Acid Sequence (domain in AA coordinates: TBD) 

MNNDD I ILAEMRPKKRAGRRVFKETRHPVYRGI I WLGTYP 

TADMAARAHDVAVLALRGRSACLNFADSAWRLPVPESNDPDVIRRVAAEAAEMFRPVDLE

SGITVLPCAGDDVDLGFGSGSGSGSGSEERNSSSYGFGDYEEVSTTMMRLAEGPLMSPPR 

SYMEDMTPTNVYTEEEMCYEDMSLWSYRY* 

>G2519 (83..691)

CAAAGTGAAAACATAAGATCATCTTCTTCGTTGATAGATCAATATAGGAACTCCAGAAGA 

GAATCTTGATCAATTAAGTATCATGTCTCACATCGCTGTTGAAAGGAATCGAAGAAGGCA

AATGAACGAGCATCTTAAATCCCTTCGTTCTTTGACTCCTTGTTTCTACATCAAAAGGGG 

AGATCAAGCTTCGATCATCGGAGGAGTGATAGAGTTCATCAAAGAGTTGCAGCAAT^ 

TCAAGTTCTTGAGTCGAAGAAACGTCGAAAGACCCT^ 

TCACCAGACAATCGAGCCATCCAGTTTAGGAGCCGCCACTACCCGAGTACCGTTTAGTCG 
AATCGAAAATGTGATGACCACAAGTACTTTCAAGGAAGT^ 

TCATGCTAACGTAGAAGCAAAGATTTCAGGTTCTAATGTTGTATTGAGAGTTGTCTCTAG 
GCGAATCGTGGGGCAGCTCGTAAAGATCATCTCTGTCTTAGAGAAGCTATCTTTTCAAGT
TCTTCACCTCAATATTAGTAGCATGGAGGAGACTGTCTTATACTl^TTCGTTGTTAAGAT
AGGATTGGAGTGTCACITAAGCTTGGAGGAGCTAACTCTTGAAG 

GTCTGATGAAGTGATCGTCTCTACCAATTAAAAACAAAATTCTACATGTACTAGAGCGTG 
TATCGTTTTTTGGGATTAATAATCATATAATCGTTACATGAGCCTTGATACTTTGCTAGA 
AATAAGCTCCTCTAAACAAAACCTTCTTTTTAAAAAAACACACTTATGTTTTACTTAGTT 
TGTTGTTGTATCCGAAGTTGATCAACGTTGTAATTTCCCACAATAAATCATGACATTTTA 
TATGCTCT 

>G2519 Amino Acid Sequence (domain in AA coordinates: 1-65)

MSHIAVERNRRRQMNEHLKSLRSLTPCFYIKRGDQASIIGGVIEFIKELQQLVQVLESKK

RRKTLNRPSFPYDHQTIEPSSLGAATTRVPFSRIENVMTTSTFKEVGA 

ISGSNVVLRVVSRRIVGQLVKIISVLEKLSFQVLHLNISSMEETVLY

LEELTLEVQKSFVSDEVIVSTN*

>G2520 (133..1197)

AAGGAGTTTTGCATACTCACCAAGCCACAAT 

TGAATCGGCGACGACTGAGTCAACTCGGTGTTGTTACTGGTTTCGTCGTATGTGTTGTAA 
CTGATTAAGTTGATGGATCCGAGTGGGATGATGAACGAAGGAGGACCGTTTAATCTAGCG 
GAGATCTGGCAGTTTCCGTTGAACGGAGTTTCAACCGCCGGAGATTCTTCTAGAAGAAGC 
TTCGTTGGACCGAATCAGTTCGGTGATGCTGATCTAACCACAGCTGCTAACGGTGATCCA 
GCGCGTATGAGTCACGCGTTGTCTCAGGCGGTTATTGAAGGTATCTCCGGCGCTTGGAAA
CGGAGGGAAGATGAGTCTAAGTCGGCGAAGATCGTCTCCACCATTGGCGCTAGTGAAGGT
GAGAACAAAAGACAGAAGATAGATGAAGTGTGTGATGGGAAAGCAGAAGCAGAATCGCTA
GGAACAGAGACGGAACAAAAGAAGCAAGAGATGGAACCAACGAAA^ 

CGAGCTAGAAGAGGTCAAGCTACTGATAGTCACAGTTTAGCTGAAAGAGCGAGAAGAGAG 

AAAATAAGTGAGCGGATGAAAATCTTGCAAGATCTTGTTCCGGGATGTAACAAGGTTATT 

GGAAAAGCACTTGTTCTAGATGAGATAATTAACTATATACAATCATTGCAACGTCAAGTT 

GAGTTCTTATCGATGAAGCTTGAAGCAGTGAACTCAAGAATGAACCCTGGTATCGA

TTTCCACCCAAAGAGGTGATGATTCTCATGATCATCAACTCAATCTTCTCCATTTTTTTC 

ACAAAACAATACATGTTTCTATCGAGGTATTCTCGGGGTAGGAGTCTCGATGTTTATGCG

GTTCGGTCATTTAAGCATTGCAATAAACGGAGTGACCTCTGTTTTTGCTCCTGCTCCCCA 

AAAACAGAACTTAAGAGAACTATATTTTCACAT^ 

CGAGTAGGAGTCGCTATTAGTTCATCTAAGCATTGCAATGAACCGTTTGGTCAGCAAGCG 
TTTGAGAATCCGGAGATACAGTTCGGGTCGCAGTCTACGAGGGAATACAGTAGAGGAGCA 
TCACCAGAGTGGTTGCACATGCAGATAGGATCAGGTGGTTTCGAAAGAACGTCTTGA 
>G2520 Amino Acid Sequence (domain in AA coordinates: 135-206) 
MDPSGMMNEGGPFNLAEIWQFPLN^^ 

HALSQAVIEGISGAWKRREDESKSAKIVSTIGASEGENKRQKIDEVCDGKAEAESLGTET

EQKKQQMEPTKDYIHVRARRGQATDSHSLAERARREKISERMKILQDLVPGCNKVIGKAL

VLDEIINYIQSLQRQVEFLSMKLEAVNSRMNPGIEVFPPKEVMILMIINSIFSIFFTKQY

MFLSRYSRGRSLDVYAVRSFKHCNKRSDLCFCSCSPKTELKTTIFSQNMTCFCRYSRVGV 

AISSSKHCNEPFGQQAFENPEIQFGSQSTREYSRGASPEWLHMQIGSGGFERTS* 

>G2533 (1..1080) 

ATGATAAGCAAGGATCCAATATCGAGTTTACCTCCAGGGTTTCGATTTCATCCAACAGAT 

GAAGAACTCATTCTCGATTACCTAAGGAAGAAAGTTTCCTCTTCCCCAGTCCCGCTTTCG 

ATTATCGCCGATGTCGATATCTACAAATCCGATCCATGGGATTTACCAGCTAAGGCTCCA 

TTTGGGGAGAAAGAGTGGTATTTTTTCAGTCCGAGGGATAGGAAATATCCAAACGGAGCA 

AGACCAAACAGAGCAGCTGCGTCTGGATATTGGAAAGCAACCGGAACAGATAAATTGATT 

GCGGTACCAAATGGTGAAGGGTTTCATGAAAACATTGGTATAAAAAAAGCTCTTGTGTTT 

TATAGAGGAAAGCGTCCAAAAGGTGTTAAAACCAATTGGATCATGCATGAATATCGTCTT 

GCCGATTC^TTATCTCCCAAAAGAATTAACTCTTCTAGGAGCGGTGGTAGCGAAGTTAAT 

AATAATTTTGGAGATAGGAATTCTAAAGAATATTCGATGAGACTGGATGATTGGGTTCTT

TGCCGGATTTACAAGAAATCACACGCTTCATTGTCATCACCTGATGTTGCTTTGGTCACA

AGCAATCAAGAGCATGAGGAAAATGACAACGAACCATTCGTAGACCGCGGAACCT^ 

CCAAATTTGCAAAATGATGAACCCCTTAAACGCCAGAAGTCTTCTTGTTCGTTCTCAAAC 

TTACTAGACGCTAC^GATTTGACGTTTCTCGCAAATTTTCTAAACGAAACC 

CGTTCTGAATCAGATTTTTCTTTCATGATTGGGAATTTCTCTAATCCTGACATTTACGGA 

AACCATTACTTGGATCAGAAGTTACCGCAGTTGAGCTCTCCCACTTCAGAGACAAGCGGC 

ATCGGAAGCAAAAGAGAGAGAGTGGATTTTGCGGAAGAAACGATAAACGCTTCGAAGAAG
ATGATGAACACATATAGTTACAATAATAGTATAGATCAAATGGATCATAGTATGATGCAA

CAACCTAGTTTCCTGAACCAGGAACTCATGATGAGTTCn'CACCTTCAATATCAAGGCTAG 

>G2533 Amino Acid Sequence (conserved domain in AA coordinates: 11-186)

MISKDPISSLPPGFRFHPTDEELILHYLRKKVSSSPVPLSIIADVDIYKSDPWDLPAKAP

FGEKEWYFFSPRDRKYPNGARPNRAAASGYWKATGTDKLIAVPNGEGFHENIGIKKALVF

YRGKPPKGVKTNWIMHEYRLADSLSPKRINSSRSGGSEVNNNFGDRNSKEYSMRLDDWVL

CRIYKKSHASLSSPDVALVTSNQEHEENDITO 

LLDATDLTFLANFLNETPENRSESDFSFMIGNFSNPDIYGNHYLDQKLPQLSSPTSETSG

IGSKRERVDFAEETINASKKMMNTYSYNNSIDQMDHSMMQQPSFLNQELMMSSHLQYQG*

>G2534 (1..975)

ATGGATAATATAATGCAATCGTCAATGCCACCGGGATTCCGATTTCATCCGACAGAGGAA 
GAGCTTGTGGGTTATTACCTAGATAGGAAGATCAATTCAATGAAGAGTGCTTTAGATGTC
ATTGTAGAGATTGATCTCTACAAAATGGAGCCATGGGATATACAAGCGAGGTGTAAACTA 
GGGTATGAAGAGCAAAACGAGTGGTACTTCTTTAGTCATAAGGACAGGAAGTACCCTACC 
GGGACTAGGACCAACCGAGCCACTGCGGCTGGGTTCTGGAAAGCCACGGGTAGAGACAAG 
GCGGTACTATCAAAAAACAGTGTCATCGGAATGCGGAAGACACTTGTCTACTACAAGGGT 
CGAGCTCCTAATGGAAGAAAGTCCGATTGGATCATGCACGAATACCGTCTCCAAAACTCC 
GAGCTTGCCCCGGTTCAGGAGGAAGGCTGGGTGGTGTGTCGAGCATTTAGGAAGCCAATT 
CCAAACCAGAGGCCATTAGGGTACGAGCCATGGCAGAACCAGCTCTACCACGTCGAAAGT
AGTAACAACTACTCATCTTCAGTGACAATGAACACGAGTC^ 

TCAAGTCATAACCTTAATCAAATGCTCATGAGCAATAACCACTACAATCCTAATAATACA 
TCCTCATCGATGCATCAATATGGCAACATTGAGCTCCCGCAGTTGGACAGCCCGAGCTTG 
TCGCCTAGTTTAGGGACGAATAAAGATCAGAACGAGAGTTTCGAGCAAGAAGAAGAGAAG 
AGCTTTAACTGTGTGGATTGGAGAACACTAGATACCTTGCTTGAGACACAAGTCATACAT 
CCGCATAACCCTAATATTCTTATGTTCGAAACGCAGTCGTATAATCCGGCGCCAAGCTTC 
CCTTCCATGCATCAAAGCTATAATGAGGTCGAAGCTAATATTCATCATTCTCTTGGATGC
TTCCCTGACTCGTAA 

>G2534 Amino Acid Sequence (conserved domain in AA coordinates: 10-157)
MDNIMQSSMPPGFRFHPTEEELVGYYLDRKINSMKSALDVIVEIDLYKMEPWDIQARCKL
GYEEQNEWYFFSHKDRKYPTGTRTNRATAAGFWKATGRDKAVLSKNSVIGMRKTLVYYKG
RAPNGRKSDWIMHEYRLQNSELAPVQEEGWVVCRAFRKPIPNQRPLGYEPWQNQLYHVES
SNKFYSSSvTMNTSHHI^ 

SPSLGTNKDQNESFEQEEEKSFNCVDWRTLDTLLETQVIHPHNPNILMFETQSYNPAPSF
PSMHQSYNEVEANIHHSLGCFPDS*
>G2573 (34..957)

CCAGATTTAATTTGAGACTCTCAAAGAAACAC(^TGGAAGAAGAGC^ACCTCCGGCCAAG 
AAACGAAACATGGGGAGATCTAGAAAAGGTTGCATGAAAGGTAAAGGCGGTCCAGAGAAC 
GCCACGTGTACTTTCCGTGGAGTTAGGCAACGGACTTGGGGTAAATGGGTGGCTGAGATC 
CGTGAGCCTAACCGTGGGACTCGTCTCTGGCTCGGCACGTTTAATACCTCGGTCGAGGCC 
GCCATGGCTTACGATGAAGCCGCTAAGAAACTCTATGGACACGAGGCTAAACTCAACTTG 
GTGCACCCAGAACAACAACAACAAGTAGTAGTGAACAGAAACTTGTCTTTTTCTGGCCAC 
GGGTCGGGTTCTTGGGCTTATAATAAGAAGCTCGATATGGTTCATGGGTTGGACCTTGGT 
CTCGGCCAGGCAAGTTGTTCACGAGGTTCTTGCTCAGAGAGATCGAGTTTTCTACAAGAA 
GATGATGATCATAGTCATAATCGATGTTCGTCTTCAAGTGGTTCGAATCTTTGTTGGTTA 
TTACCTAAACAAAGTGATTC^CAAGATCAAGAGACCGT^ 

GGTGAAGGCGGTGGTGGCTCTACGTTAACGTTTTCGACCAATTTGAAACCAAAGAATTTG 
ATGAGTCAGAATTATGGATTATACAATGGAGCTTGGTCTAGGTTTCTTGTGGGGCAAGAA 
AAGAAGACGGAACATGACGTGTCATCGTCGTGTGGATCGTCGGACAACAAGGAGAGTATG
TTGGTTCCTAGTTGCGGCGGAGAGAGGATGCATAGGCCGGAGTTGGAAGAGCGAACAGGA
TATTTGGAAATGGATGATCTTTTGGAGATTGATGATTTAGGTT^ 

GGAGATTTCAAGAATTGGTGTTGTGAAGAGTTTCAACATCCATGGAATTGGTTCTGAGAG 
TTTTTATTTATTACTATTATTTATCATACATATTTCTTATA 

>G2573 Amino Acid Sequence (domain in AA coordinates: TBD) 

MEEEQPPAKKRNMGRSRKGCMKGKGGPENATCTFRGVRQRTWGKWVAEIREPNRGTRLWL
GTFNTSVEAAMAYDEAAKKLYGHEAKLNLVHPEQQQQVVVNRNLSFSGHGSGSWAYNKKL

DMVHGLDLGLGQASCSRGSCSERSSFLQEDDDHSHNRCSSSSGSNLCWLLPKQSDSQDQE
TVNATTSYGGEGGGGSTLTFSTNLKPKNLMSQNYGLYNGAWSRFLVGQEKKTEHDVSSSC
GSSDNKESMLVPSCGGERMHRPELEERTGY^
QHPWNWF* 

>G2589 (23..1354)

AAAGAAAAGAAAAATAAAGATAATGAGGACGAAGACTAAGTTAGTACTCATACCTGATAG 
ACACTTTCGGAGAGCCACATTCAGGAAGAGGAATGCAGGGATAAGGAAGAAACTCCACGA 
GCTGACAACTCTCTGTGACATCAAAGCATGTGCGGTAATCTACAGTCCGTTCGAGAATCC 
AACGGTGTGGCCGTCAACCGAAGGTGTTCAAGAGGTGATTTCGGAGTTCATGGAGAAGCC 
GGCGACAGAACGGTCCAAGACGATGATGAGTCATGAGACTTTCTTGCGGGACCAAATCAC 
CAAAGAAC7UUUICAAACTAGAGAGTCTACGTCGTGAAAACCGAGAAACTCAGCTTAAGCA 
TTTTATGTTTGATTGCGTTGGAGGCAAGATGAGTGAGCAACAGTATGGTGCAAGGGACCT 
TCAAGATTTAAGTCTTTTTACTGATCAATATCTTAATCAGCTTAATGCCAGGAAGAAGTT 
CCTTACAGAATATGGTGAGTCTTCTTCTTCTGTTCCTCCTCTGTTTGATGTTGCGGGTGC 
CAATCCTCCTGTTGTTGGAGATCAAGCTGCGGTAACTGTTCCTCCTTTGTTTGCTGTTGC 
GGGTGCCAATCTTCCTGTTGTTGCTGATGAAGCTGCGGTAACTGTTCCTCCTCTGTTTGC 
TGTTGCGGGTGCCAATCTTCCTGTTGTTGCAGATCAAGCTGCGGTTAATGTTCCTACTGG 
ATTTCATAACATGAATGTGAACGAGAATCAGTATGAGCCGGTT^ 

TGGTTTTAGTGATCATATTCAATATCAGAATATGAACTTCAATCAAAACCAACAAGAGCC 

GGTTCATTACCAGGCTCTTGCTGTTGCGGGTGCCGGTCTTCCTATGACTCAGAATCAGTA 

TGAGCCCGTTCACTACCAGAGTCTTGCTGTCGCGGGTGGCGGTCTTCCTATGAGTCAGTT 

GCAGTATGAGCCGGTTCAGCCTTATATCCCTACTGTTTTTAGTGATAATGTTCAATATCA 

GCATATGAATTTGTATCAAAATCAACAAGAGCCGGTTCACTACCAAGCTCTTGGTGTTGC 

AGGTGCGGGTCTTCCTATGAATCAGAATCAGTATGAGCCGGTTCAGCCCTATGTCCCTAC 

TGGTTTTAGTGATCATTTTCAGTTTGAGAATATGAATTTGAATCAAAATCA^ 

GGTTCAATACCAAGCTCCTGTTGATTTTAATCATCAGATTCAACAAGGAAACTATGATAT 

GAATTTGAACC^GAATATGAGTTTGGATCCAAATCAGTATCCGTTTCAAAATGATCCATT 

CATGAATATGTTGACAGAATATCCTTATGAATAAGCGGGTTATGTTGGAGAGCATGCAC 

>G2589 Amino Acid Sequence (domain in AA coordinates: TBD) 

MRTKTKLVLIPDRHFRRATFRKRNAGIRKKLHELTTLCDIKACAVTYSPFENPTVWPSTE

GVQEVISEFMEKPATERSKTMMSHETFLRDQITI^ 

GKMSEQQYGARDLQDLSLFTDQYLNQLNARKKFLTEYGESSSSVPPLFDVAGANPPVVAD

QAAVTVPPLFAVAGANLPVVADQAAVTVPPLFAVAGANLPVVADQAAVNVPTGFHNMNVN

QNQYEPVQPYVPTGFSDHIQYQNMNFNQNQQEPVHYQALAVAGAGLPMTQNQYEPVHYQS

LAVAGGGLPMSQLQYEPVQPYIPTVFSDNVQYQHMNLYQNQQEPVHYQALGVAGAGLPMN

QNQYEPVQPYVPTGFSDHFQFENMNLNQNQQ

LDPNQYPFQNDPFMNMLTEYPYE*

>G2687 (45..1139)

CTCTGTCTCTCGTATCTTTCTACTACTCTGTTTCTTGAATTCTAATGAACAACATCGACG 
ACGCAAAGACGGAGACTTCAGTGTCTTCAGGTTCAAGCGACTCTTTCTTGCCTCTCAAGA 
AACGGATGAGACTTGATGACGAACCAGAAAACGCCCTAGTGGTTTCGTCTTCACCAAAGA 
CGGTTGTGGCTTCTGGCAATGTCAAGTACAAAGGAGTCGTTCAGCAACAGAACGGTCATT 
GGGGTGCCCAGATTTACGCAGACCACAAAAGGATTTGGCTTGGAACTTTCAAATCCGCTG 
ATGAAGCCGCCACGGCTTACGATAGTGCATCTATCAAACTCCGAAGCTTTGACGCTAACT 
CGCACCGGAACTTCCCTTGGTCTACAATCACTCTCAACGAACCAGACTTTCAAAATTGCT 
ACACAACAGAGACTGTGTTGAACATGATCAGAGACGGTTCGTACCAACACAAATTCAGAG 
ATTTTCTCAGAATCAGATCTCAGATTGTTGCGAGTATCAACATCGGGGGACCAAAACAAG
CCCGAGGAGAAGTGAATCAAGAATCAGACAAGTGTTTTTCTTC 

AGGAATTGACACCGAGCGATGTAGGGAAACTAAATAGGCTTGTGATACCTAAAAAGTATG 

CAGTGAAGTATATGGCTTTCATAAGCGCTGATCAAAGCGAGAAAGAAGAGGGTGAAATAG

TAGGATCTGTGGAAGATGTGGAGGTTGTGTTTTACGACAGAGCAATGAGACAATGGAAGT 

TTAGGTATTGTTACTGGAAAAGTAGCCAGAGCTTTGTCTTCACCAGAGGATGGAATAGTT

TCGTGAAGGAGAAGAATCTCAAGGAGAAGGATGTTATTGCCTTCTACACTTGCGATGTCC 

CGAACAATGTGAAGACATTAGAAGGTCAAAGAAAGAACTTCTTGATGATCGATGTTCATT 

GCTTTTCAGACAACGGTTCCGTGGTAGCTGAGGAAGTAAGTATGACGGTTCATGACAGTT 

CAGTGCAAGTAAAGAAAACAGAAAACTTGGTTAGCTCCATGTTAGAAGATAAAGAAACCA 

AATCAGAGGAGAACAAAGGAGGGTTTATGCTGTTTGGTGTAAGGATCGAATGTCCTTAGG 

GAATTTTTCTTTAAAAGTTTCTTACTTCAACTAGAACTTGTTTTACTTGTACCT 

>G2687 Amino Acid Sequence (domain in AA coordinates: TBD)
MNNIDDAKTETSVSSGSSDSFLPLKKRMRLDDEPENALVVSSSPKTVVASGNVKYKGVVQ
QQNGHWGAQIYADHKRIWLGTFKSADEAATAYDSASIKLRSFDANSHRNFPWSTITLNEP
DFQNCYTTETVLNMIRDGSYQHKFRDFLRIRSQIVASINIGGPKQARGEVNQESDKCFSC
TQLFQKELTPSDVGKLNRLVIPKKYAVKYMAFISADQSEKEEGEIVGSVEDVEVVFYDRA
MRQWKFRYCYWKSSQSFVFTRGWNSFVKEKNLKEKDVIAFYTCDVPNNVKTLEGQRKNFL

MIDVHCFSDNGSVVAEEVSMTVHDSSVQVKKTENLVSSMLEDKETKSEENKGGFMLFGVR
IECP*

>G27 (83..622)

CAAAATACGAAAAACAAAACATTTTTTTTAAT 

CGTTACATTAAATTATCTTTAGATGCAAGACTCTTCCTCTCACGAATCGCAA 
CCGGTCACCGGTGCCGGAGAAAACCGGAAAGAGTTCTAAGACTAAAAATGAGCAAAAAGG 
TGTTTCTAAACAACCAAATTTTCGTGGGGTCAGAATGAGACAATGGGGAAAATGGGTGTC 
TGAAATTAGAGAACCAAGAAAGAAATCAAGAATATGGCTCGGT^ 

GATGGCGGCGCGTGCACACGACGTGGCGGCTTTAGCCATCAAAGGTGGCTCTGCCCACCT
TAATTTCCCGGAGCTAGCTTACCATTTGCCGAGACCGGCTAGCGCGGACCCTAAAGACAT 
TCAAGAAGCCGCCGCCGCAGCAGCTGCCGTTGACTGGAAAGCACCGGAGTCTCCGTCTAG 
CACCGTGACGTCATCTCCAGTCGCCGACGACGCTTTCTCCGATCTTCCTGATCTTTTGCT 
TGACGTGAATGATCACAACAAAAACGATGGATTCTGGGACTCGTTTCCGTACGAAGATCC 
TTTCTTCTTGGAAAATTACTAGAAGGCAAATTCTTGC 

TTCCCGGTAAATAAGAAGACGATGTCGTTTTGTACCTTTTTTGTCTACGATGGGAAATTT 
CTTTTTTTTTTACGTGTGAGTAAAAGTTTCCGAATGTGTGATGTGTAAG 
TATTTAATTTCTTTTTTTTGTACAAATACGTACGTCATTACCAAAAAGTTTTCATTTATT 
GTGCTTTTATCTTCCAAATTCATTAAAAAAAAAAAAAAAAA 

>G27 Amino Acid Sequence (domain in AA coordinates: 37-104) 
MQDSSSHESQRNLRSPVPEKTGKSSKTKNEQKGVSKQPNFRGVRMRQWGKWVSEIREPRK
KSRIWLGTFSTPEMAARAHDVAALAIKGGSAHLNFPELAYHLPRPASADPKDIQEAAAAA
AAVDWKAPESPSSTVTSSPVADDAFSDLPDLLLDVNDHNKNDGFWDSFPYEDPFFLENY*
>G2720 (1..894) 

ATGGAAGCGAAGAAGGAAGAGATAAAGAAAGGTCCATGGAAAGCCGAAGAAGACGAAGTA 
CTCATCAACCATGTCAAGAGATACGGTCCTCGTGATTGGAGCTCCATTCGATCCAAAGGT 
CTTCTTCAACGCACCGGCAAATCCTGTCGTCTTCGTTGGGTCAATAAACTCCGTCCCAAT 
CTCAAAAATGGATGCAAGTTCTCGGCTGACGAAGAGAGGACTGTGATTGAGTTAGAATCT 
GAGTTTGGTAACAAATGGGCGAGAATCGCTACGTATCTACCGGGAAGAACTGATAACGAT 
GTGAAGAATTTCTGGAGTAGCAGACAAAAGAGACTCGCTAGGATTCTTCATAACTCCTCT 
GATGGATCGAGTTCGAGTTTCAATCCCAAATCT 

AACGTGAAACCAATCCGTCAATCCTCTCAGGGTTTTGGTTTGGTTGAGGAAGAGGTTACA 

GTTTCTTCTTCATGTTCCCAGATGGTTCCTTATTCATCTGATCAAGTTGGTGATGAAG

TTGAGGTTGCCGGATTTGGGTGTTAAGTTAGAGCATCAGCCTTTCGCTTTTGGC^ 

CTTGTCCTAGCAGAGTACTCTGACTCACAGAATGATGCAAATCAGCAAGCAATCAGCCCT 

TTCTCTCCAGAAAGCAGAGAGCTTTTGGCTAGACTTGACGACCCTTTTTACTATGATATA 

CTTGGACCAGCTGATTCTTCTGAGC(^TTGTTCGCTCTCCCTCAGCCGTTCTTCGAGCCT 

TCGCCTGTGCCGAGAAGATGCAGACATGTTTCAAAGGATGAAGAAGCTGATGTTTTCTTA 

GACGATTTCCCAGCTGACATGTTTGATCAGGTTGATCCAATCCCAAGTCCTTAG 

>G2720 Amino Acid Sequence (domain in AA coordinates: 10-114) 

MEAKKEEIKKGPWKAEEDEVLINHVKRYGPRDWSSIRSKGLLQRTGKSCRLRWVNKLR 

LKNGCKFSADEERWIELQSEFGNKWARIATYLPGRTDl^^ 

DASSSSFNPKSSSSHRLKGKNVKPIRQSSQGFGLVEEEVTVSSSCSQMVPYSSDQVGDEV 
LRLPDLGVKLEHQPFAFGTDLVLAEYSDSQNDANQQAISPFSPESRELLARLDDPFYYDI
LGPADSSEPLFALPQPFFEPSPVPRRCRHVSKDEEADVFLDDFPADMFDQVDPIPSP*
>G2787 (142..1584)

TCTGAGAGCAAAAAACAAAAAAAAAGAAAAAAAAACCCTAA^ 
CCTCrTGTCTTTTTTTTTTTTGTTCTTTTTl^^ 

CTCTGCAAAAATCTCACATCCATGGATCCATCTCTTGGTGATCCTCATCATCCTCCTCAG 
TTCACCCCTTTTCCTCATTTTCCCACCTCCAATCATCATCCTTTAGGACCAA 
AATAACCATGTCGTCTTCCAACCGCAGCCGCAAAC^ 
ATGTTTCAGTTATCTCGACATGTTTCAATGCCCC^ 

GCTGCGATTGCGGCGTTAAACGAACCGGATGGTTCGAGCAAGATGGCAATTTCGAGATAC
ATCGAGAGATGTTACACCGGTTTAACTTCTGCTCATGCTGCTTTGTTGACTCACCATCTC
AAGACTTTGAAGACCAGTGGTGTTCTTTCTATGGTTAAGAAATCTTACAAAATTGCTGGT 
TCTTCTACTCCTCCTGCTAGTGTAGCTGTTGCTGCTGCTGCCGCCGCTCAAGGTCTCGAT 
GTTCCCAGATCTGAGATTCTCCATTCAAGTAACAACGATCCCATGGCTTCTGGCTCTGCT
TCTCAGCCTCTGAAACGAGGTCGTGGTCGTCCTCCTAAGCCTAAACCIX3AATCTCAACCA 
CAACCACTACAGCAACTTCCACCGACCAATCAAGTCCAGGCTAACGGACAGCCAATCTGG 
GAACAGCAGCAAGTTCAATCACCTGTTCCGGTTCCGACTCCGGTTACAGAGTCGGCGAAG 
AGAGGACCTGGTCGTCCAAGGAAGAACGGTTCTGCTGCTCCTGCTACTGCACCAATCGTT 
CAAGCTTCGGTTATGGCTGGAATTATGAAACGTAGAGGTAGACCACCGGGTCGTCGAGCT 
GCTGGGAGACAGAGGAAGCCCAAATCCGTTTCTTCTACTGCCTCTGTGTATCCTTATGTT 
GCTAATGGTGCTAGACGCAGAGGAAGGCCTAGGAGAGTTGTTGACCCTAGCAGTATTGTT 
AGTGTTGCTCCAGTAGGTGGTGAAAATGTGGCAGCGGTTGCGCCAGGGATGAAGCGTGGA 
CGTGGACGACCACCTAAGATTGGTGGTGTTATCAGTAGGCTTATTATGAAGCCTAAGAGA 
GGACGAGGACGTCCTGTAGGTAGACCCAGAAAGATTGGAACATCAGTCACGACTGGGACA 
CAAGATTCTGGAGAACTCAAGAAGAAGTTTGATATTTTTCAAGAGAAAGTGAAAGAAATT 
GTGAAGGTGTTGAAGGATGGAGTTACAAGTGAGAATCAAGCAGTGGTGCAAGCCATAAAA 
GATCTGGAAGCACTAACAGTGACGGAGACCGTTGAGCGAC^ 

CCAGAGGAGACTGCAGCACCACAGACTGAAGCTCAACAAACTGAAGCTGCTGAGACACAA 
GGAGGACAAGAAGAAGGAC^GAAAGAGAAGGAGAAACAC^GACCC^GAC^GAAGCAGAG 
GCAATGCAAGAAGCTCTGTTCTGAAGAATAATAATGATCTAGAAAACAACCTAGACATAA 
TAGCCTTGGTGTTTGGCGTTAGGAGTGTTTTTTTOTAGTTGTTTTAGGTGTTGGAATCGC 
ATCTTAAATTATATAAAAATCTATAAGGAATTTTAATTTTTCTAGGTTTTGTTGTCTGCA 
GAAGAAGAAATAGTAGACTCGTTAATGGTGTTGTTGTCGGTGTGTCTTTAACCAAACCAT 
AAGACGTGGCTGTAAATTAGCGATGTTTCTAGTCTTCCATCTTTAATAATCTCTTATTGC 
GTCTGTGCCTTTGTTTTT 

>G2787 Amino Acid Sequence (domain in AA coordinates: 172-192, 226-247, 256-276, 290-311, 245-366)

MDPSLGDPHHPPQFTPFPHFPTSNHHPLGPNPYWNHVVFQPQPQTQTQIPQPQMFQLSPH 
VSMPHPPYSEMICAAXAALNEPDGSSKMAISR^ 

VLSMVKKSYKIAGSSTPPASVAVAAAAAAQGLDVPRSEILHSSNNDPMASGSASQPLKRG

RGRPPKPKPESQPQPLQQLPPTNQVQANGQPIWEQQQVQSPVPVPTPVTESAKRGPGRPR 

KNGSAAPATAPIVQASVMAGIMKRRGRPPGRRAAGRQRKPKSVSSTASVYPYVANGARRR

GRPRRVVDPSSIVSVAPVGGENVAAVAPGMKRGRGRPPKIGGVISRLIMKPKRGRGRPVG

RPRKIGTSVTTGTQDSGELKKK^DIFQEKA^IVKVIiKDGVTSE^QAW 

TETVEPQVMEEVQPEETAAPQTEAQQTEAAETQGGQEEGQEREGETQTQTEAEAMQEALF 
* 

>G2789 (82..879)

CTTTAGGGACACCAAATCTATTCAACCTAAAAGCCT^ 

TTTTAGCGAATCAGAAGAGGAATGGATGAGGTATCTCGTTCTCATACACCGCAATTTCTA 

TCAAGTGATGATCAGCACTATCACCATCAAAACGCTGGACGACAAAAACGCGGCAGAGAA 

GAAGAAGGAGTTGAACCCAACAATATAGGGGAAGACCTAGCCACCTTTCCTTCCGGAGAA

GAGAATATCAAGAAGAGAAGGCCACGTGGCAGACCTGCTGGTTCCAAGAACAAACCCAAA 

GCACCAATCATAGTCACTCGCGACTCCGCGAACGCCTTCAGATGTCACGTCATGGAGATA 

ACCAACGCCTGCGATGTAATGGAAAGCCTAGCCGTCTTCGCTAGACGCCGTCAGCGTGGC 

GTTTGCGTCTTGACCGGAAACGGGGCCGTTACAAACGTCACCGTTAGACAACCTGGCGGA 

GGCGTCGTCAGTTTACACGGACGGTTTGAGATTCTTTCTCTCTCGGGTTCGTTTCTTCCT 

CCACCGGCACCACCAGCTGCGTCTGGTTTAAAGGTTTACTTAGCCGGTGGTCAAGGTCAA 

GTGATCGGAGGCAGTGTGGTGGGACCGCTTACGGCATCAAGTCCGGTGGTCGTTATGGCA 

GCTTCATTTGGAAACGCATCTTACGAGAGGCTGCCACTAGAGGAGGAGGAGGAAACTGAA 

AGAGAAATAGATGGAAACGCGGCTAGGGCGATTGGAACGCAAACGCAGAAACAGTTAATG 

CAAGATGCGACATCGTTTATTGGGTCGCCGTCGAATTTAATTAACTCTGTTTCGTTGCCA 

GGTGAAGCTTATTGGGGAACGCAACGACCGTCTTTCTAAGATAATATCATTGATAATATA 
AGTTTCGTCTTCTTATTCTTTTTCACT 

AACGTTTGATTAATACCTGAAGGTTTTTGGAAAATTTTCGATCGGATAAAAGGATTTATG 
TTGCGAGCCGAAACGCGGCC 

>G2789 Amino Acid Sequence (domain in AA coordinates: 53-73, 121-165) 
MDEVSRSHTPQFLSSDHQHYHHQNAGRQKRGREEEGVEPNNIGEDLATFPSGEENIKKRR
PRGRPAGSKNKPKAPIIVTRDSANAFRCHVMEITNACDVMESLAVFARRRQRGVCVLTGN

GAVTNVTVRQPGGGVVSLHGRFEILSLSGSFLPPPAPPAASGLKVYLAGGQGQVIGGSVV

GPLTASSPVVVMAASFGNASYERLPLEEEEETEREIDGNAARAIGTQTQKQLMQDATSFI

GSPSNLINSVSLPGEAYWGTQRPSF*

>G31 (13..615)

CTTTTATAAGCAATGGCTCCAAGACAGGCGAACGGTAGAAGCATTGCCGTGAGTGAAGGC 
GGCGGAGGGAAGACGATGACGATGACGACGATGCGGAAGGAAGTGCACTTTAGAGGTGTG 
AGGAAGCGTCCATGGGGTAGATACGCGGCGGAGATCCGTGACCCGGGAAAGAAAACCCGG 
GTTTGGCTCGGGACATTCGA<^CGGCGGAGGAAGCTGCAAGAGCTTACGACACCGCCGCT 
AGAGAGTTTCGTGGCTCCAAAGCAAAGACTAATTTCCCTCTTCCCGGAGAGTCTACTACG
GTTAACGACGGTGGCGAGAACGATTCTTACGTCAACCGTACGACGGTGACGACGGCGCGT 
GAGATGACGCGTCAGAGATTTCCGTTTGCATGTCACCGGGAGCGTAAAGTCGTCGGTGGT 
TATGCTTCTGCTGGlU'TTi'TCTTCGATCCGTCAAGAGCTGCTTCGTTAAGAGCAGAGCTT 
TCTCGGGTTTGTCCGGTTCGGTTTGATCCGGTTAATATCGAGTTGAGTATTGGTATTCGA 
GAAACCGTAAAAGTTGAACCGAGAAGAGAACTAAACCTGGATCTTAACCTAGCTCCACCG
GTGGTGGACGTTTAGATTTTTTTCTTCTTTTGATAATTTGTATTTTACATTGCCGGAAAA 
TAATTAATGTTTTCTTTAG 

>G31 Amino Acid Sequence (domain in AA coordinates: TBD)
MAPRQANGRSIAVSEGGGGKTMTMTTMRKEVHFRGVRKRPWGRYAAEIRDPGKKTRVWLG
TFDTAEEAARAYDTAAREFRGSKAKTNFPLPGESTTVNDGGENDSYVNRTTVTTAREMTR
QRFPFACHRERKVVGGYASAGFFFDPSRAASLRAELSRVCPVRFDPVNIELSIGIRETVK
VEPRRELNLDLNLAPPVVDV*
>G33 (20..757)

ATTCTCCCCCAACCAAAATATGACCACAGAAAAAGAGAATGTCACTACGGCCGTGGCCGT 
GAAAGACGGCGGAGAAAAGAGTAAGGAAGTGAGTGACAAGGGCGTAAAGAAGAGAAAGAA 
TGTAACTAAGGCCCTGGCCGTGAATGACGGCGGAGAAAAGAGTAAGGAAGTGCGTTACAG 
GGGTGTAAGGAGGAGACCATGGGGGAGATATGCTGCGGAGATCCGTGATCCGGTAAAGAA 
AAAACGGGTCTGGCTCGGGTCCTTCAACACGGGGGAGGAAGCCGCCAGAGCCTACGACTC 
CGCTGCCATAAGGTTTCGAGGATCGAAAGCTACTACTAACTTCCCTCTAATCGGATACTA 
TGGGATTTCTTCGGCGACGCCGGTGAACAAC71ACCTTTCCGAGACGGTGAGTGATGGAAA 
TGCCAACCTCCCTCTCGTTGGAGACGATGGGAATGCTTTGGCTTCTCCGGTGAACAACAC 
CCTTTCCGAAACGGCGCGTGATGGAACACTTCCAT 

AGAGGAGCTTGATCGAGTTTGTCCTGACCAGTTTGAGTCCATTGATATGGGGTTGACTAT 
TGGTCCTCAAACCGCCGTGGAAGAGCCTGAGACTTCCTCCGCCGTGGATTGTAAGCTGCG 
AATGGAACCGGATCTTGACCTCAACGCAAGTCCCTAAAGATTGATCTGATGTTGTTGTCC 
TTGAATAAGTTTGTTATCTTGTCGCTCTTCTGATTGTCTGTACTTCTATTGGTTGATTCG 
TGCTTTTGGAGGACAAAACAAACATTTTTTTATGTATTAAAAAAAGGTAATTGAACTATT 
ATCGAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 

>G33 Amino Acid Sequence (domain in AA coordinates: 50-117) 
MTTEKENVTTAVAVTCDGGEKS KEVSDK KEVRYRGVRRRP 
WGRYAAEIRDPVKKKRVWLGSFNTGEEAARAYDSAAIRFRGSKATTNFPLIGYYGISSAT
PVNNNIiSETVSDGNA]SrLPLVGDDGNAIJVSPVNNTLS 

AGFFLDLPEVIALKEELDRVCPDQFESIDMGLTIGPQTAVEEPETSSAVDCKLRMEPDLD
LNASP* 

>G342 (1..723)

ATGGACGTCTACGGCATGTCTTCACCGGACTTGCTTCGTATCGACGACCTTCTCGATTTC 
TCCAACGACGAAAT^TTCTCTTCCTCTTCCACCGTCACTTCCTCCGCCGCTTCCTCCGCC 
GCTTCTTCCGAAAACCCTTTCAGCTTTCCTTCTTCCACCTACACTTCTCCTACTCTCCTC 
ACCGACTTCACTCACGATCTCTGCGTTCCCAGTGACGACGCAGCTCATCTCGAATGGTTA 
TCGCGATTCGTTGACGATTCATTCTCCGATTTCCCAGCAAATCCTTTAACCATGACCGTT 
AGACCCGAGATTTCATTCACCGGAAAACCTAGAAGTCGCCGATCAAGAGCACCAGCACCT 
TCCGTAGCTGGAACTTGGGCTCCGATGTCTGAATCAGAGCTTTGTCACTCCGTCGCTAAA 
CCTAAACCGAAGAAAGTCTACAACGCTGAATCGGTTACGGCGGATGGAGCGAGGCGGTGC 
ACGCACTGTGCCTCGGAGAAAACGCCACAGTGGAGAACTGGACCGCTTGGACCTAAAACA 
CTTTGTAACGCTTGTGGAGTTCGTTACAAATCAGGGAGGCTTGTACCGGAATACAGACCG 
GCGTCGAGTCCGACGTTTGTATTGACTCAGCATTCGAACTCTCATCGGAAAGTTATGGAG
CTCCGGCGACAGAAGGAACAACAAGAATCTTGCGTTCGAATTCCGCCGTTTCAGCCGCAG
TAA 

>G342 Amino Acid Sequence (domain in AA coordinates: 155-190) 

MDVYGMSSPDLLRIDDLLDFSNDEIFSSSSTVTSSAASSAASSENPFSFPSSTYTSPTLL

TDFTHDLC^TPSDDAAHLEWLSRFTODSFSDFPANPLTMTV^ 

SVAGTWAPMSESELCHSVAKPKPKKVYNAESOTADGARRC^ 

LCNACGVRYKSGRLVPEYRPASSPTFVLTQHSNSHRKVMELRRQKEQQESC^ 

* 

>G352 (80..817)

AATACACCACACACTTCACTCTTTCTTCATCTTCTTC 

tctcacagaattaaatcttatggctctcgagactctcaattctccaacagctacca 

cgctcggcctcttctccggtatcgtgaagaaatggagcctgagaatctcgagcaatgggc 

taaaagaaaacgaacaaaacgtcaacgttttgatcacggtcatcagaatcaagaaacgaa 

caagaaccttccttctgaagaagagtatctcgctctttgtctcctcatgctcgctcgtgg 

ctccgccgtac^tctcctcctcttcctcctctaccgt'cacgtgcgtcaccgtccgatca 

ccgagattacaagtgtacggtctgtgggaagtccttttcgtcataccaagccttaggtgg 

acac^gacgagtcaccggaaaccgacgaacactagtatcacttccggtaaccaagaact 

gtctaataacagtcacagtaagagcggttccgttgttattaacgttaccgtgaacactgg 

taacggtgttagtcaaagcggaaagattcacacttgctcaatc 

gtctggtcaagccttaggtggacacaaacggtgtc^c^ 

cggtaacggaagtagcagcaacagcgtagaactcgtcgctggtagtgacgtcagcgatgt 
tgataatgagagatggtccgaagaaagtgcgatcggtggccaccgtggatttgacctaaa 
cttaccggctgatcaagtctcagtgacgacttcttaa 

>G352 Amino Acid Sequence (domain in AA coordinates: 99-119,166-186) 
MALETLNSPTATTTARPLLRYREEMEPEl^EQWAKRKR 

EEYLALCLLMLARGSAVQSPPLPPLPSRASPSDHRDYKCTVCGKSFSSYQALGGHKTSHR
KPTNTS ITSGNQELSNNSHSNSGS WI NVTVNTGNGVSQSGKIHTCS ICFKSFASGQALG 
GHKRCHYDGGNNGNGNGSSSNSVELVAGSDVSDVDNERWSEESAIGGHRGFDLNLPADQV
SVTTS* 

>G357 (1..615) 

ATGCAGAACAAACACAAATGC^^GCTCTGTTC(^UVGAGTTTCTGTAATGGCAGAGCACTT 
GGTGGTCACATGAAGTCTCACTTGGTCTCATCTGAGTCTTCAGCTCGGAAGAAACTAGGT 
GACTCGGTCTATTCTTCTTCTTCCTCTTCCTCCGATGGTAAAGCGCTCGCCTACGGGTTA 
CGAGAGAACCCGAGGAAGAGTTTCCGGGTCTTTAATCCGGATCCTGAGTCATCCACAATT 
TACAACAGTGAGACAGAGACCGAACCTGAATCCGGAGACCCGGTTAAGAAACGGGTCAGA 
GGAGATGTTTCAAAGAAGAAGAAGAAGAAGGCAAAGAGTAAGAGAGTGTTTGAGAACTCG 
AAGAAGCAAAAGACAATTCACGAGTCACCAGAACCAGCGAGTTCTGTCTCTGATGGTTCT 
CCTGAACAAGATTTAGCTATGTGCTTGATGATGCTGTCAAGAGATTCAAGGGAGCTCGAG 
ATTAAACTGAAAAAACCGGAGGAAGAGAGGAAGCCGGAAAAAAGACATTTCCCTGAGCTC 
CGTCGCTGTATGATAGATCTGAATCTTCCTCCGCCGCAAGAAGCTGAAGCTGTCACCGTC 
GTTTCAGCCATATAA 

>G357 Amino Acid Sequence (domain in AA coordinates: 7-29) 
MQNKHKCKLCSKSFCNGRALGGHMKSHLVSSQSSARKKLGDSVYSSSSSSSDGKALAYGL
RENPRKSFRVFNPDPESSTIYNSETETEPESGDPVKKRVRG 

KKQKTIHESPEPASSVSDGSPEQDLAMCLMMLSRDSRELEIKLKKPEEERKPEKRHFPEL
RRCMIDLNLPPPQEAEAVTVVSAI*
>G358 (1..855) 

ATGGGTCAAGATGAGGTTGGGAGTGATCAGACGCAAATCATAAAAGGGAAACGTACGAAG 
CGACAAAGATCGTCTTCGACGTTTGTGGTGACGGCGGCGACAACAGTGACTTCAACAAGT 
TCATCGGCCGGTGGAAGTGGAGGAGAAAGAGCTGTTTCAGATGAATACAACTCGGCGGTT 
TCGTCTCCGGTGACTACTGATTGTACGCAAGAAGAAGAAGACATGGCGATTTGTCTCATC
ATGl^AGCTCGTGGGACAGTTCTTCCATCGCCGGATCTCAAGAACTCGAGAAAAATTCAT 
CAGAAGATTTCGTCGGAGAATTCTAGTTTCTATGTGTACGAGTGTAAAACGTGTAACCGG 
ACGTTTTCGTCGTTCCAAGCACTTGGTGGACACAGAGCGAGCCACAAGAAGCCGAGGACG 
TCGACTGAGGAAAAGACTAGACTACCCCTGACGCAACCCAAGTCTAGTGCATCAGAAGAA 
GGGCAAAACAGTCATTTCAAAGTTTCCGGCTCAGCCCTAGCTTCACAGGCAAGTAACATC 
ATCAACAAGGCAAACAAAGTACACGAGTGTTCCATCTGCGGTTCTGAGTTCACTTCCGGG
CAAGCTCTCGGTGGTCACATGAGGCGGCACAGGACAGCCGTAACCACGATTAGCCCCGTT

GCAGCCACCGCAGAAGTAAGCAGAAACAGTACAGAGGAAGAGATTGAGATCAATATAGGC 

CGTTCGATGGAACAGGAGAGGAAATATCTACCGTTGGATC^ 

GATGATCTAAGAGAGTCAAAGTTTCAAGGGATAGTATTCTCAGC^^ 

GATTGTCATTACTAG 

>G358 Amino Acid Sequence (domain in AA coordinates: 124-135, 188-210) 
MGQDEVGSDQTQIIKGKRTKRQRSSSTFVVTAATTVTSTSSSAGGSGGERAVSDEYNSAV
SSPVTTDCTQEEEDMAICLIMLJ^^ 

TFSSFQALGGHRASHKKPRTSTEEKTRLPLTQPKSSASEEGQNSH^ 

INKANKVHECSICGSEFTSGQALGGHMRRHRTAVTTISPVAATAEVSRNSTEEEIEINIG

RSMEQQRKYIiPLDLNLPAPEDDLRESKFQGIWSATPALIDCHY* 

>G360 (1..543) 

ATGTGGAACCCTAACAAAATTGAAGAATTGGAGGATGATGATGAATCTTGGGAAGTCAAA 
GCCTTTGAGCAAGACACTAAAGGCAACATCTCTGGTACCACTTGGCCTCCAAGATCTTAC
ACTTGCAATTTCTGCCGCCGTGAGTTCCGTTCTGCTCAAGCCTTAGGCGGTCACATGAAT 
GTCC^CCGCCGTGACCGCGCCTCATCTAGGGCTCATCAAGGTTCCACCGTTGCGGCTGCG 
GCTAGAAGCGGCCACGGGGGGATGTTACTCAATTCTTGTGCTCCGCCGTTGCCTACAACG 
ACACTTATAATACAATCCACGGCGAGTAACATTGAAGGTTTGTCCCATTTCTACCAACTG
CAAAACCCTAGTGGC^TTTTTGGTAATTCTGGTGACATGGTGAATCTTTATGTAGAAGTT 
CCTCCTCGGCTTATTGAATATTCGACAGGAGATGATGAGAGCATTGGCTCGATGAAAGAA 
GCGACAGGAACATCAGTGGATGAGCTTGATCTTGAACTTCGGCTAGGGCACCATCCACCG 
TGA 

>G360 Amino Acid Sequence (domain in AA coordinates: 42-62)
MWNPNKIEELEDDDESWEVKAFEQDTKGNISGTTWPPRSYTCNFCRREFRSAQALGGHMN
VHRRDRASSRAHQGSTVAAAARSGHGGMLLNSCAPPLPTTTLIIQSTASNIEGLSHFYQL
QNPSGIFGNSGDMVNLYVEVPPRLIEYSTGDDESIGSMKEATGTSVDELDLELRLGHHPP
* 

>G362 (195..830)

ATAAAAAACCCTTCATACAATATAAAATTTCTTTAGACATACAATATATTATACTATTAC 
AGATGCAATGCATCATTAGTTACAAACTATTAAACTAAATATCCCCCGTCTCTCTCTTGC 
TATATAAAGAAGATCATTTACACATCTCCTTAAGCAAATTAAACCCATCGATAAACACAT
ACGTTCACACATATATGTCTATAAATCCGACAATGTCTCGTACTGGCGAAAGTTCTTCAG 
GTTCGTCCTCCGACAAGACGATAAAGCTATTCGGCTTCGAACTCATCAGCGGCAGTCGTA 
CGCCGGAAATCACGACGGCGGAAAGCGTGAGCTCGTCCACAAACACGACGTCGTTAACAG 
TGATGAAAAGACACGAGTGCCAATACTGCGGTAAAGAGTTTGCAAATTCTCAAGCCTTAG 
GAGGTCACCAAAACGCTCACAAGAAGGAGAGGTTGAAGAAGAAGAGGCTTCAGCTTCAAG 
CTCGGCGAGCCAGCATCGGCTATTATCTCACCAACCACCAACAACCCATAACGACGTCAT 
TTCAGAGACAATACAAAACGCCGTCGTATTGTG<^TTCTCCTCCATGCACGTGAATAATG 
ATCAGATGGGTGTGTACAACGAAGATTGGTCGTCGAGGTCGTCGCAGATTAACTTCGGTA 
ATAATGACACGTGCCAAGATCTTAATGAACAAAGCGGTGAGATGGGTAAGCTGTACGGTG 
TTCGACCGAACATGATTCAGTTCCAGAGAGATCTGAGTTCTCGTTCTGATCAGATGAGAA 
GTATTAACTCGCTGGATCTTGATCTAGGTTTTGCCGGAGATGCGGCATAACAAATTAAAG 
AGAGATATATGATTAAGATTATATGTACTATAGTGGCGTATTTCATTGGGATCATGAAGG 
GGAAAAAACGAGACATATAGTATTCTTGATGCAATTTGAGTTTTGTAATTTATTTAGGTT 
TATGTATGTTTTCGAAG 

>G362 Amino Acid Sequence (domain in AA coordinates: 62-82) 

MSINPTMSRTGESSSGSSSDKTIKLFGFELISGSRTPEITTAESVSSSTNTTSLTVMKRH

ECQYCGKEFANSQALGGHQNAHKKERLKKKRLQLQARRASIGYYLTNHQQPITTSFQRQY

KTPSYCAFSSMHVimDQMGVYiraDWSSRSSQINFGN^ 

IQFQRDLSSRSDQMRSINSLDLHLGFAGDAA*

>G364 (64..516)

AAGCTTGATATCGCCTCTCTCTAATCTCTCTTTCTCTCTCTATCTCTAAGAATATATAAA 
GGTATGGACTACCAGCCAAACACATCCCTACGTCTAAGCCTACCAAGTTACAAGAACCAC 
CAACTAAACCTAGAACTTGTTCTCGAGCCTTCTTCCATGTCTTCTTCTTCATCTTCTTCC 
ACGAACTCATCATCATGTTTGGAGCAGCCT 

AAGTTTTAC^GCTCTCAAGCTCTTGGTGGT(^TCAAAACGCTCATAAGCTTGAGAGAACC 
TTAGCCAAGAAGAGTCGAGAACTCTTTAGATCCTCAAACACTGTTGATTCTGATCAGCCT
TACCCGTTCTCCGGTCGCTTTGAGCTTTACGGCCGTGGCTACCAAGGATTTCTCGAAAGT
GGCGGCTCGAGGGACTTCTCCGCCCGCCGTGTGCCGGAGAGTGGTCTTGATCAGGATCAG 
GAGAAGAGTCACCTTGACTTATCCTTAAGGCTCTAAAAGAATCTTATATTTTGTTAGTCT 
ATATATTATCATATCAATTGTTAATCTTAAAATTGATTGTTTTACTTATTAGTCATTTCC 
TATTATCTGAAAGTTTTCTTTGTAAGTTGTAACTATGGTCCTAAATTCAAATCCAAATTT 
GATTTTGGAAGATGGTACCTAATGCAGTAGTTAAATAAGTTAAAAAAATGAAGGATCTAT 
AATTCTCT 

>G364 Amino Acid Sequence (domain in AA coordinates: 54-76) 
MDYQPNTSLRLSIiPSYKNHQLNL^ 

FYSSQALGGHQNAHKLERTLAKKSRELFRSSNTVDSDQPYPFSGRFELYGRGYQGFLESG 

GSRDFSARRVPESGLDQDQEKSHLDLSLRL*

>G365 (69..755)

CAATTCTTTTACTTTCATTCTCTTTATATATTCTCTCTACGCTATAATATATATTACAC^ 
GAATATACATGGAACCGTCCATCAAAGGAGATCAAGAAATGTTAAAAATCAAGAAAC^G 
GTC^TOVAGATCTTGAGTTGGGGTTGACCCTTTTGTCACGTGG 

AGCTCAATCTCATCGATTCTTTCAAAACCAGCTCATCATCGACTTCTCATCATCAGCACC 

AGCAAGAACAATTGGCAGATCCGAGAGTGTTCTCGTGTAATTATTGTCAAAGAAAGTTCT 

ATAGTTCACAAGCGCTAGGCGGTCACCAAAACGCTCATAAACGTGAGCGCACCTTAGCCA 

AACGTGGACAGTATTACAAGATGACTCTCTCCTCCTTGCCTTCTTCAGCGTTTGCGTTTG 

GCCACGGTTCAGTCAGCAGATTCGCAAGCATGGCATCGTTACCATTACATGGCTCGGTGA 

ATAACAGGTCAACGTTAGGGATTCAAGCTCATTCAACGATCCATAAGCCCAGCTTCTTAG

GAAGACAAACGACGAGTTTAAGTCATGTTTTCAAACAGAGCATTCACCA 

TAGGAAAGATGTTGCCGGAGAAATTTCACCTTGAAGTCGCCGGAAATAATAACAGTAACA 

TGGTTGCTGCTAAGTTGGAGAGAATTGGACATTTCAAGAGCAACCAAGAAGATCATAATC 

AGTTTAAGAAAATTGACTTGACTCTTAAGCTATGAGCTCTGCCATCTTCTTTTTAGTCTT 

CATTATAACTTTTTTTATTCTCATCTTTGTTTGATATAATGATTGACGGCAGGGTGTGTT 

AGAGTTTCACTAATGATCAAGTTGTACTTTTTATATATTTCATTGATACCTTGTTGATGT 

AATTCAATATTTTAGGTCTGTTTTT 

>G365 Amino Acid Sequence (domain in AA coordinates: 70-90)

MEPSIKGDQEMLKIKKQGHQDLELGLTLLSRGTATSSELNLIDSFKTSSSSTSHHQHQQE 

QLADPRVFSCNYCQRKFYSSQALGGHQNAHKRERTLAKRGQYYKMTLSSLPSSAFAFGHG

SVSRFASMASLPLHGSVNNRSTLGIQAHSTIHKPSFLGRQTTSLSHVFKQSIHQKPTIGK

MLPEKFHLEVAGJmNSNMVAAKLERIGHFKSNQEDHNQF^ 

>G367 (1..708)

ATGGACGCTTCAATAGTTTCCTCATCCACTGCTTTTCCATATCAAGATTCTCTAAACCAG 
AGCATCGAAGACGAAGAAAGAGACGTTCATAATTCTAGTCACGAACTCAATCTCATCGAC 
TGCATAGACGACACAACGAGTATCGTTAACGAATCTA^^ 

TTCTCATGCAACTATTGTCAAAGAACTTTCTATAGCTGACAAGGACTTGGTGGTCA 

AACGCACACAAGAGAGAGAGAACGTTGGCGAAGAGAGGACAACGTATGGCAGCGTCAGCC 

TCAGCTTTTGGACATCCTTACGGTTTCT 

CATAGGTCTTTAGGGATCCAAGCGC^TTCGATAAGCCACAAGCTAAGTTCTTATAACGGG 

TTTGGTGGTCACTATGGTCAGATCAACTGGTCAAGACTTCCATTTGATCAACAACCAGCC 

ATAGGTAAATTTCCCTCAATGGATAATTTTCATCATCATCATCATCAGATGATGATGATG 

GCTCCTTCAGTAAATTCACGGTCCAATAACATCGATAGCCCAAGCAACACAGGACGGGTT 

CTAGAAGGGTCACCGACTCTTGAACAATGGCACGGAGACAAAGGATTGTTGTTAAGTACA 

AGTCATCATGAAGAGCAGCAGAAACTTGACTTGTCCCTCAAGCTTTGA 

>G367 Amino Acid Sequence (domain in AA coordinates: 63-84)

I^ASIVSSSTAPPY^DSLNQSIEDEERDVHNSSHELNLIDCIDDTTSIVNESTTSTEQKL 

FSCNYCQRTFYSSQALGGHQNAHKRERTLAKRGQRMAASASAFGHPYGFSPLPFHGQYNN

HRSLGIQAHSISHKLSSYNGFGGHYGQINWSRLPFDQQPAIGKFPSMDNFHHHHHQMMMM

APSVNSRSNNIDSPSNTGRVLEGSPTLEQWHGDKGLLLSTSHHEEQQKLDLSLKL*

>G373 (1..1854) 

ATGGCGATTGAAACTCAGCTTCCTTGCGACGGTGACGGTGTGTGTATGCGGTGTCAGGTG 
AATCCTCCGTCAGAAGAGACTCTCACTTGTGGCACGTGCGTCACTCCATGGCACGTGCCG 
TGTCTCCTCCCCGAATCACTCGCTTCTTCCACTGGAGAGTGGGAGTGTCCCGATTGCTCC 
GGCGTTGTCGTTCCCTCCGCCGCTCCGGGTACCGGAAACGCTCGACCTGAATCTTCCGGT 
TCAGTTCTCGTTGCTGCGATCCGTGCGATTCAGGCTGATGAGACTTTAACCGAAGCTGAG
AAAGCCAAAAAAAGGCAGAAACTGATGAGTGGGGGTGGTGACGATGGTGTCGATGAAGAA

GAGAAGAAGAAGTTAGAAATCTTTTGTTCTATTTGCATTCAATTGCCAGAAAGACCTATC 

ACGACACCGTGTGGGCACAATTTCTGTTTGAAATGTTTCGAGAAATGGGCAGTAGGTCAA 

GGGAAGCTAACTTGTATGATATGCCGAAGCAAAATTCCGAGACATGTGGCAAAAAATCCT 

CGCATCAACTTAGCTCTAGTTTCTGCTATTCGTTTAGCAAATGTTACCAAATGTTCTGTT 

GAGGCAACTGCAGCCAAGGTTCATCATATTATCCGCAACCAAGACCGTCCTGAGAAAGCA 

TTTACTACCGAGCGGGCAGTAAAAACTGGGAAAGCTAATGCTGCTAGCGGTAAGTTTTTT 

GTGACAATACCTCGTGATCATTTTGGTCCCATACCAGCTGAGAATGATGTCACTAGAAAG 

CAAGGTGTTTTGGTTGGAGAATCTTGGGAGGACAGGCAAGAGTGTAGGCAGTGGGG 

GATTTCCCGCATATTGCTGGCATTGC^ 

CTCTCTGGAGGTTATGACGATGATGAGGATCATGGTGAATGGTTTCTCTACACAGGAAGT 

GGTGGAAGGGATCTCAGTGGAAACAAAAGAATTAACAAGAAACAGTCGTCTGACCAGGCG 

TTTAAAAACATGAATGAATCTCTAAGACTTAGTTGCAA7UVTGGGCTATCCTGTCCGAGTT 

GTCAGGTCTTGGAAGGAGAAGCGTTCTGCATATGCCCCTGCTGAAGGTGTGAGATATGAT 

GGGGTCTATCGAATTGAGAAGTGCTGGAGTAATGTTGGAGTACAGGGTTCTTTTAAGGTC 

TGTCGTTACCTGTTTGTTAGATGTGACAATGAGCCAGCTCCATGGACCAGTGATGAGCAT 

GGCGATCGTCCAAGACCGITGCCTAATGTTCCGGAGCTrGAGACTGCTGCTGACCTGTTT 

GTGAGAAAGGAGAGTCCATCATGGGATTTCGATGAAGCTGAGGGTCGTTGGAAATGGATG 

AAGTCTCCTCCTGTTAGCAGAATGGCTTTGGATCCTGAGGAGAGGAAGAAGAATAAGAGA 

GCAAAAAATACTATGAAGGCCAGACTTCTGAAAGAATTTAGTTGCCAAATCTGTCGGGAA 

GTGCTGAGTCTTCCAGTGACGACGCCTTGTGCACACAACTTCTGCAAAGCATGCTTAGAA 

GCGAAGTTTGCTGGGATAACTCAACTGAGAGAGAGAAGCAATGGCGGACGTAAACTACGT 

GCAAAGAAGAACATCATGACCTGCCCTTGCTGCACGACGGATCTCTCCGAGTTTCTCCAA 

AACCCGCAGGTGAACAGAGAGATGATGGAGATAATAGAGAATTTTAAGAAGAGTGAGGAA 

GAGGCTGATGCATCCATTTCTGAAGAAGAAGAAGAAGAATCCGAACCTCCAACTAAGAAG 

ATTAAGATGGATAACAACTCTGTTGGTGGTAGTGGTACAAGTCTCTCAGCTTAA 

>G373 Amino Acid Sequence (domain in AA coordinates: 129-168) 

MAIETQLPCDGDGVCMRCQVNPPSEETLTCGTCVTPWHVPCLLPESLASSTGEWECPDCS

GVVVPSAAPGTGNARPESSGSVLVAAIRAIQADETLTEAEKAKKRQKLMSGGGDDGVDEE

EKKKLEIFCSICIQIiPERPITTPCGHNFCLKCFEKWAVGQGKLTCMICRSKIPRHVA^ 

RINLALVSAIRLANVTKCSVEATAAKVHHIIRNQDRPEKAFTTERAVKTGKANAASGKFF

VTIPRDHFGPIPAENDVTRKQGVLVGESWEDRQECRQWGAHFPHIAGIAGQSAVGAQSVA

LSGGYDDDEDHGEWFLYTGSGGRDLSGNKRINKKQSSDQAFKNMNESLRLSCKMGYPVRV

VRSWKEKRSAYAPAEGVRYDGVYRIEKCWSNVGVQGSFKVCRYLFVRCDNEPAPWTSDEH

GDRPRPLPNVPELETAADLFVRKESPSWDFDEAEGRWKWMKSPPVSRMALDPEERKKNKR

AKNTMKARLLKEFSCQICREVLSLPVTTPCAHNFCKACLEAKFAGITQLRERSNGGRKLR

AKKNIMTCPCCTTDLSEFLQNPQVNREMMEIIENFKKSEEEADASISEEEEEESEPPTKK

IKMDNNSVGGSGTSLSA* 

>G396 (1..957) 

ATGGGGGAAAGAGATGATGGGTTGGGTTTGAGTCTAAGCTTGGGAAATAGTCAACAAAAA 
GAACCATCTCTGAGGTTGAATCTTATGCCGTTGACAACTTCTTCTTCTTCTTCTTCGTTT 
CAACACATGCACAATCAGAATAACAATAGCCATCCCCAGAAGATTCATAACATCTCTTGG 
ACTCATCTGTTTCAATCTTCTGGGATTAAACGTACAACTGCAGAGAGAAACTCCGACGCC 
GGGTCATTTCTAAGAGGTTTCAACGTGAACAGAGCTCAGTCTTCGGTGGCGGTAGTGGAC 
TTGGAAGAAGAAGCCGCCGTCGTCTCGTCTCCAAACAGCGCCGTTTCGAGTCTGAGTGGA 
AATAAAAGGGATCTTGCGGTGGCGAGAGGAGGAGATGAAAACGAGGCGGAGAGAGCTTCT 
TGCTCACGCGGAGGGGGAAGCGGTGGTAGCGACGATGAAGACGGCGGAAACGGCGACGGA 
TCAAGGAAGAAACT-ftCGGTTATCGAAGGATCAAGCTCTTGTTCTCGAGGAGACTTTTAAA 
GAACATAGCACTCTTAATCCGAAGCAAAAGCTGGCTCTAGCAAAACAGTTGAATCTAAGG 
GCAAGACAAGTTGAAGTGTGGTTTCAGAACCGTAGGGCAAGGACGAAGCTGAAACAAACG 
GAGGTTGATTGTGAGTATTTAAAGAGATGTTGCGATAATCTGACCGAGGAGAATCGACGG 
CTGGAGAAAGAAGTGTCGGAGCTGAGGGCGTTGAAGTTGTCTCCACATCTCTACATGCAC 
ATGACTCCTCCTACTACTCTCACCATGTGCCCTTCTTGCGAACGTGTCTCCTCCTCTGCC 
GCCACTGTGACCGCTGCTCCTTCCACTACTACTACTCCTACGGTGGTGGGGCGGCCAAGT 
CCACAGCGATTAACTCCTTGGACTGCTATTTCT 

>G396 Amino Acid Sequence (domain in AA coordinates: 159-220) 
MGERDDGLGLSLSLGNSQQKEPSLRLNLMPLTTSSSSSSFQHMHNQNNNSHPQKIHNISW
THLFQSSGIKRTTAERNSDAGSFLRGFNVNRAQSSVAVVDLEEEAAVVSSPNSAVSSLSG

NKM)LAVARGGDENEABRASCSRGGGSGGSDDEDGGNGDGSRKI^ 

EHSTLNPKQKIjALAKQIiNLRARQVEWFQNRRART 

LQKEVSELRALKLSPHLYMHMTPPTTLT^ 

PQRLTPWTAISLQQKSGR* 

>G431 (1..1149) 

ATGGAGAGTGGTTCCAACAGCACTTCTTGTCCAATGGCTTTTGCCGGGGATAATAGTGAT 
GGTCCGATGTGTCCTATGATGATGATGATGCCGCCCATCATGACATCACATCAACATCAT 
GGTCATGATCATCAACATCAACAACAAGAACATGATGGTTATGCATATCAGTCACACCA 
CAACAAAGTAGTTCCCTTT^TCTTCAATCAC^^ 

GTTGCTTCTTCTTCTTCTCCTTCCTCTTGTGCTCCTGCCTATTCTCTAATGGAGATCCAT 

CATAACGAAATCGTTGCAGGAGGAATCAACCCTTGCTCCTCTTTCTCTTCTTCAGCCTCT 

GTCAAGGCCAAGATCATGGCTCATCCTCACTACCACCGCCTCTTGGCCGCTTATGTCAAT 

TGTCAGAAGGTTGGAGCACCACCGGAGGTTGTGGCGAGGCTGGAGGAGGCATGCTCGTCT 

GCCGC^GCCGCAGCCGCATCTATGGGGCCAACAGGGTGTCTTGGTGAAGATCC^ 

GATCAATTCATGGAAGCTTACTGTGAAATGCTCGTTAAGTATGAGCAAGAGCTCTCCAAA 

CCTTTCAAGGAAGCTATGGTCTTCCTTCAACGTGTCGAGTGTCAATTCAAATCCCTCTCT 

CTATCCTCACCTTCCTCTTTCTCCGGTTATGGAGAGACAGCAATTGATAGGAACAATAAT 

GGGTCATCCGAGGAAGAAGTCGATATGAACAATGAATTTGTAGATCCACAAGCTGAGGAT 

AGAGAGCTTAAAGGACAGCTCTTGCGCAAGTACAGTGGTTACTTAGGGAGCCTCAAGCAA 

GAGTTCATGAAGAAGAGGAAGAAAGGAAAGCTCCCTAAAGAAGCTCGTCAACAACTGCTT 

GATTGGTGGAGCCGTCACTACAAATGGCCTTACCCTTCGGAGCAACAAAAGCTCGCCCTT 

GCGGAATCAACGGGGCTGGACCAGAAACAGATAAACAATTGGTTCATAAACCAGAGGAAA 

CGGCATTGGAAGCCGTCGGAGGACATGCAGTTTGTAGTAM 

CATTACTTGATGGATAATGTCTTGGACAATC^ 

ATGCTTTGA 

>G431 Amino Acid Sequence (domain in AA coordinates: 286-335) 

MESGSNSTSCPMAFAGDNSDGPMCPMMMMMPPIMTSHQHHGHDHQHQQQEHDGYAYQSHH 

QQSSSLFLQSLAPPQGTKNKVASSSSPSSCAPAYSLMEIHHNEIVAGGINPCSSFSSSAS 

VKAKIMAHPHYHRLLAAYVNCQKVGAPPEVVARLEEACSSAAAAAASMGPTGCLGEDPGL

DQFMEAYCEMLVKYEQELSKPFKEAIWFLQRv^ 

GSSEEEVDIVttJKEFvIJPQAEDRELKGQ 

DWWSRHYKWPYPSEQQKI1AI1AESTGIODQKQINNWFTO 

HYFMDNVLDNPFPMDHISSTML*

>G479 (1..1128)

ATGGAGATGGGTTCCAACTCGGGTCCGGGTCATGGTCCGGGTCAGGCAGAGTCGGGTGGT 
TCCTCCACTGAGTCATCCTCTTTCAGTGGAGGGCTCATGTTTGGCCAGAAGATCTACTTC 
GAGGACGGTGGTGGTGGATCCGGGTCTTCTTCCTCAGGTGGTCGTTCAAACAGACGTGTC 
CGTGGAGGCGGGTCGGGTCAGTCGGGTCAGATACCAAGGTGCCAAGTGGAAGGTTGTGGG 
ATGGATCTAACCAATGCAAAAGGTTATTACTCGAGACACCGAGTTTGTGGAGTGCACTCT 
AAAACACCTAAAGTCACTGTGGCTGGTATCGAACAGAGGTTTTGTCAACAGTGCAGCAGG 
TTTCATCAGCTTCCGGAATTTGACCTAGAGAAAAGGAGTTGCCGCAGGAGACTCGCTGGT 
CATAATGAGCGACGAAGGAAGCCACAGCCTGCGTCTCTCTCTGTGTTAGCTTCTCGTTAC 
GGGAGGATCGCACCTTCGCTTTACGAAAATGGTGATGCTGGAATGAATGGAAGCTTTCTT 
GGGAACCAAGAGATAGGATGGCCAAGTTCAAGAACATTGGATACAAGAGTGATGAGGCGG 
CCAGTGTCGTCACCGTCATGGCAGATCAATCCAATGAATGTATTTAGTCAAGGTTCAGTT 
GGTGGAGGAGGGACAAGCTTCTCATCTCCAGAGATTATGGACACTAAACTAGAGAGCTAC 
AAGGGAATTGGCGAGTCAAACTGTGCTCTCTCTCTTCTGT 

GACAACAACAACAACAACAACAACAACAGCAACAACAACAACAATACATGGCGAGCTTCT
TCAGGTTTTGGCCCGATGACX3GTTACAATC 

CAGTATCTGAACCCGCCTTGGGTATTCAAGGACAATGATAATGATATGTCTCCTGTTTTG 
AATTTAGGTCGATACACCGAGCCAGATAATTGTCAGATAAGTAGTGGCACGGCAATGGGT 
GAGTTCGAGTTATCTGATCACCATCATCAAAGTAGGAGACAGTACATGGAA 
ACAAGGGCTTATGACTCTTCTTCTCACCATACCAACTGGTCTCTCTGA 

>G479 Amino Acid Sequence (conserved domain in AA coordinates: 70-149)

MEMGSNSGPGHGPGQAESGGSSTESSSFSGGLMFGQKIYFEDGGGGSGSSSSGGRSNRRV 

RGGGSGQSGQIPRCQVEGCGMDLTNAKGYYSRHRVCGVHSKTPKVTVAGIEQRFCQQCSR
FHQLPEFDLEKRSCRRRLAGHNERRRKPQPASLSVLASRYGRIAPSLYENGDAGMNGSFL
GNQEIGWPSSRTLDTRVMRRPVSSPSWQINPMNVFSQGSVGGGGTSFSSPEIMDTKLESY
KG IGDSNCALSLLSNPHQPHDNNNl^^ 

QYLNPPWVFKDNDNDMSPVLNLGRYTEPDNCQISSGTAMGEFELSDHHHQSRRQY^ 

TRAYDSSSHHTNWSL* 

>G546 (1..588) 

atgactcgaccgtcaagattacttgagacggcggcgccaccaccacaaccgtcggaggag 
atgatcgcagcggaatccgacatggtggtgatcttgtcggctcttctttgcgctcttatc 
tgcgttgctggtctcgccgccgtcgtacgatgcgcttggctccggcggtttacagccgga 
ggagattcgccgtcaccgaacaaaggcttgaaaaagaaagctcttcagtctcttccaaga 
tccactttcaccgccgcggaatcaacctccggcgccgccgctgaagagggagactcgacg 
gaatgtgctatttgcctcactgacttcgccgacggtgaagaaataagagtgcttcctctt 
tgtggtcattctttccacgtggagtgtattgacaaatggctagtttctaggtcttcttgt 
ccttcttgtcgcaggattcttacgccggtgagatgtgaccggtgtggtcatgcttctacg 
gcggagatgaaagatcaagctcatcgtcatcaacatcaccaacactcttctactaccatt 
cctacgtttcttccttaa 

>G546 Amino Acid Sequence (domain in AA coordinates: 114-155)

MTRPSRLLETAAPPPQPSEEMIAAESDMWILSALLC^^ 

GDSPSPNKGI4KKKALQSLPRSTFTAAESTSGAAAEEGD 

CGHSFHVECIDKWLVSRSSCPSCRRILTPVRCDRCGHASTAEMKDQAHRHQHHQHSSTTI
PTFLP* 

>G551 (1..708) 

ATGGAGTGGTC^C^CGAGC^ACGTAGAAAACGTGAGAGTAGCTTTC^TGCCACCGCC^ 

TGGCCGGAGTCTAGTTCCTTTAACTCGCTCCACAGCTTCAACTTTGATCCTTACGCAGGA 

AATTCATATACGCCTGGCGATACACAAACCGGACCGGTTATCTCTGTACCGGAATCAGAA 

AAGATCATGAATGCGTACCGATTTCCGAACAACAAG^ 

CTAACGAGTGGACAATTAGCTTCACTTGAGCGAAGTTTTCAAGAAGAGAT 

TCAGACAGGAAGGTGAAGCTGTCGAGAGAGCTCGGTCTGCAGCCACGTCAGATAGCAGTT 

TGGTTCCAAAACCGCCGTGCACGGTGGAAGGCGAAGCAGCTTGAGCAGTTGTACGACTCG 

CTTAGACAAGAGTACGACGTCGTTTCTAGGGAGAAACAAATGTTACACGATGAGGTGAAG 

AAGCTGAGAGCTTTACTAAGAGACCAGGGTTTGATCAAGAAGCAAATCTCTGCCGGGACC 

ATCAAAGTTTCCGGTGAGGAAGACACGGTGGAGATTTCATCGGTGGTGGTAGCTCATCCA 

AGAACGGAGAATATGAACGCAAATCAAATCACCGGAGGGAATCAAGTTTACGGTCAATAC 

AACAATCCGATGCTGGTTGCTTCCTCTGGCTGGCCGTCATACCCCTGA 

>G551 Amino Acid Sequence (conserved domain in AA coordinates: 73-133)

MEWSTTSNVFINV^VAPMPPPWPESSSFNSIjHSFNFDPYAGNSYTPGDTQTGPVISVPESE 

KI MNAYRFPNUWNEM I KKKRLTS GQLASLERS FQEE I KLDSDRKVKIiSREIjGLQPRQ I AV 

WFQNRRARWKAKQLEQIiYDSLRQEYDWSREKQMLHDEVKKLRALL^ 

I KVS GEEDTVE I S S VVVAHPRTENMNANQI TGGNQ VTGQYWNPMLVAS S GWPS YP * 

>G578 (1..978) 

ATGCATAGTTTGAATGAAACAGTAATTCCTGATGTTGATTACATGCAGTCTGATAGAGGG 
CATATGCATGCTGCTGCCTCTGATTCCAGTGATCGATCAAAGGATAAGTTGGATCAAAAG 
ACCCTTCGTAGGCTTGCTCAAAATCGTGAGGCAGCAAGAAAAAGCAGATTGAGGAAGAAG 
GCGTATGTTCAGCAGCTGGAAGATAGTCGATTAAAGCTGACTCAAGTTGAGCAGGAGCTG 
CAAAGAGCAAGACAGCAGGGAGTTTTCATCTGAA 

GGTGGCAATGGTGGGGCTTTGGCATTTGATGCAGAACACTCACGATGGCTTGAAGAAAAG 
AACAGGCAAATGAACGAGCTGAGATCTGCCCTGAATGCTCATGCAGGTGATACTGAGCTC 
CGGATAATTGTGGAffGGAGTGATGGCTCACTATGAGGAGCTTTTCAGGATTAAGAGCAAT 
GCATCTAAGAATGATGTCTTCCACTTGTTATCTGGAATGTGGAAAACACCAGCTGAGCGA 
TGTTTCTTGTGGCTTGGCGGGTTCCCGTCATCCGAACTTCTCAAGCTTCTTGCGAATCAG 
CTAGAGCCCATGACAGAACGACAGGTAATGGGCATCAATAGCTTGCAGCAGACGTCGCAG 
CAGGCAGAAGATGCTTTATCTCAAGGGATGGAGAGTTTACAGCAATCCCTAGCTGATACT 
TTATCCAGTGGAACTCTTGGTTCCAGTTCATCGGATAATGTCGCGAGCTACATGGGTCAG 
ATGGCCATGGCAATGGGCAAGTTAGGCACCCTCGAAGGATTCATACGCCAGGCTGATAAC 
TTGAGGCTGCAAACACTAC^^CAGATGCTTCGAGTATTAACAACACGTCAGTCAGCTCGT 
GCTCTTCTTGCTATACACGATTATTCATCTCGATTACGTGCTCTTAGTTCCTTGTGGCTT 
GCCCGGCCAAGAGAGTGA

>G578 Amino Acid Sequence (domain in AA coordinates: 36-96)

MHSLNETVIPDVDYMQSDRGHMHAAASDSSDRSKDKLDQKTLRRIiAQNRE 

AYVQQLEDSRLKLTQVEQELQRARQQGVFISSSGDQAHSTGGNGGALAFDAEHSRWLEEK 

NRQMNELRSALNAHAGDTELRIIVDGVMAHYEELFRIKSNASKNDVFHLLSGMWKTPAER

CFLWLGGFPSSELLKLLANQLEPMTERQVMGINSLQQTSQQAEDALSQGMESLQQSLADT

LSSGTLGSSSSDOTASYMGQMAMAMGKLGT^ 

ALLAIHDYSSRLRALSSLWLARPRE*

>G596 (168..1121)

TAATTTCTCTACTTCAGATTTTTTTCTCCTTAGATTAATTTAATTGAGTTA 
CCTCAAGCTAAGATTCTGGTTTTGTGAGTTGAGTGGATGAGAAGAGGAGAGATTAACTAA 
ATTAGGGTTTCAATTGTTTACTTTTTGTTTGCTTTTTATATCAAGTAATGGATCAGGTC^ 
CTCGCTCTCTTCCTCCACCTTTTCTCT 

TCCAGCATCAGCAGCAGCAGCAGCAACAGAATCACGGCCACGATATAGACCAGCACCGAA 
TCGGTGGGCTAAAACGTGACCGAGATGCTGATATCGATCCCAACGAGCACTCTTCAGCCG 
GAAAAGATCAAAGTACTCCTGGCTCCGGTGGAGAAAGCGGCGGCGGAGGAGGAGGAGATA 
ATCACATCACGAGAAGGCCACGTGGCAGACCAGCGGGATCTAAGAACAAACCAAAACCGC 
CAATCATCATCACTCGAGACAGCGCAAACGCTCTCAAATCTCATGTCATGGAAGTAGCAA 
ACGGATGTGACGTCATGGAAAGTGTCACCGTCTTCGCTCGCCGTCGCCAACGTGGCATCT 
GCGTTTTGAGCGGAAACGGCGCCGTTACCAACGTTACCA^ 

CCGGTGGTGGCTCATCTGTCGTTAACTTACACGGACGTTTCGAGATTCTTTCTCTCTCGG
GATCATTCCTTCCTCCTCCGGCTCCACCAGCTGCGTCAGGTCTAACGATTTACTTAGCCG 
GTGGTCAGGGACAGGTTGTTGGAGGAAGCGTGGTTGGTCCACTCATGGCTTCAGGACCTG 
TAGTGATTATGGCAGCTTCGTTTGGAAACGCTGCGTATGAGAGACTGCCGTTGGAGGAAG 
ACGATCAAGAAGAGCAAACAGCTGGAGCGGTTGCTAATAATATCGATGGAAACGCAACAA 
TGGGTGGTGGAACGCAAACGGAAACTGAGACGCAGGAGC^^ 

AAGATCCGACGTCGTTTATACAAGGGTTGCCTCCGAATCTTATGAATTCTGTTCAATTGC 
CAGCTGAAGCTTATTGGGGAACTCCGAGACC^TCTTTCTAAATCGCGAAGAAAAAACAAG 
TTAGATACGTTCGTTGTTTTTAATTTATAATCTCTCTTCTGTCAAGTTTTAATTTTCTTT 
TTCTTCTTCTTTGTTTTCTAAAGATAATTGTAGTCTTTGACGA 

GAATCGAAGAGAATCGTTTTGGTCATGGGATTGCTCGATCTATTAGGTTTGAGAGGGGGT 
TTGTGTTTTGCGTTGACTAGCAGATTATAAAATTGTTGATTTTCGAGTTTTTATTTTCAT 
GTGTTGGTGATAAA 

>G596 Amino Acid Sequence (domain in AA coordinates: 89-96)
MDQVSRSLPPPFLSRDLHLHPHHQFQHQQQQQQQNHGHDIDQHRIGGLKRDRDADIDPNE
HSSAGKDQSTPGSGGESGGGGGGDNHITRRPRGRPAGSKNKPKPPIIITRDSANALKSHV
MEVANGCDVMESVTVFARRRQRGICVLSGNGAVTNVTIRQPASVPGGGSSVVNLHGRFEI
LSLSGSFLPPPAPPAASGLTIYLAGGQGQVVGGSVVGPLMASGPVVIMAASFGNAAYERL
PLEEDDQEEQTAGAVANNIDGNATMGGGTQTQTQTQQQQQQQLMQDPTSFIQGLPPNLMN
SVQLPAEAYWGTPRPSF* 
>G617 (59..1141)

CAGATCTGTTCTTTACACCAAATTGAGTACTGAAGATCTTGTTGAGTGAATTAAAGAGAT 

GAGATCAGGAGAATGTGATGAAGAGGAGATTCAAGCAAAGCAAGAAAGAGATCAAAATCA 

AAATCATCAAGTAAACTTAAACCACATGTTGCAACAACAACAGCCGAGTTCGGTATCATC 

TTCAAGGCAATGGACTTCAGCTTTTAGGAATCCAAGAATCGTTCGAGTCTCAAGAACATT 

CGGTGGCAAAGACAGACACAGCAAAGTATGTACAGTCCGTGGTCTTCGAGACCGGAGGAT 

AAGGTTGTCCGTACCTACAGCTATTCAACTCTACGACCTTCAAGATCGATTAGGGCTGAG 

TCAGCCAAGCAAAGTCATTGATTGGTTACTCGAAGCAGCAAAAGATGACGTAGACAAGCT 

ACCTCCTCTACAATTCCCACATGGATTTAACCAGATGTATCCAAATCTCATCTTCGGAAA

CTCCGGGTTTGGAGAATCTCCATCTTCAACTACATCAACAACGTTTCCAGGAACCAATCT 

CGGGTTCTTGGAAAATTGGGATCTTGGTGGTTCTTCAAGAACAAGAGCAAGATTAACCGA 

TACAACTACGACCCAAAGAGAAAGTTTTGATCTTGATAAAGGAAAATGGATCAAAAACGA 

CGAGAATAGTAATGAAGATCATCAAGGGTTTAACACCAATCAT 

GACGAATCCGTACAACAACACTTG^ 

AGACCAATCTGGTAATAACGTTACTGTCGCAATATCTAATGTTGCTGCTAATAATAACAA 
TAATCTCAATTTGCATCCTCCTTCCTCGT 

TCCTACTCCTCCGGCAATGAGCTCTCTATTCCCGACATACCCTTCGTTTCTTGGAGCTTC 
TC^TC^TCATCATGTCGTCGATGGAGCCGGTC^TCTTC^GCTCTTTAGCTCGAATTCAAA
TACCGCATCGCAGCAACACATGATGCCGGGTAATACGAGTTTGATTAGACCATTTCATCA
TTTGATGAGCTCGAATCATGATACGGATCATCATAGTAGCGATAATGAATCAGATTCTTG 
AATGATTTTATATATCTACACTATACATTGAAAATGTTATATGTATACGTATTCTTCTAT 
ATTTTGATATATATGCGTATTGTTGGATTGGTTTATGTATCT 

>G617 Amino Acid Sequence (domain in AA coordinates: 64-118) 

MRSGECDEEEIQAKQERDQNQNHQVNLNHMLQQQQPSSVSSSRQWTSAFKNPRIVRVSRT

FGGKDRHSKVCTVRGLRDRRIRLSVPTAIQLYDLQDRLGLSQPSKVIDWLLEAAKDDVDK

LPPLQFPHGFNQMYPNLIFGNSGFGESPSSTTSTTFPGTNLGFLENWDLGGSSRTRARLT 

DTTTTQRES FDLD KGKW I KKTOENSNQDHQGFlTrNHQQQFPLTITP YNNTS AYYITLGHIiQQS 

LDQSGNNVTVAJCSNVAANNN^^ 

SHHHHVVDGAGHLQLFSSNSNTASQQHIW^ 

* 

>G620 (40..666)

GAATTGAACTTGGACCAGCACAGCAACAACCCAACCCCAATGACCAGCTCAGTCATAGTA 
GCCGGCGCCGGTGACAAGAACAATGGTATCGTGGTCCAGGAGCAACCACCATGTGTGGCT
CGTGAGCAAGACCAATACATGCCAATCGCAAACGTCATAAGAATCATGCGTAAAACCTTA 
CCGTCTCACGCCAAAATCTCTGACGACGCGAAAGAAACGATTCAAGAATGTGTCTCCGAG 
TACATCAGCTTCGTGACCGGTGAAGCCAACGAGCGTTGCCAACGTGAGCAACGTAAGACC 
ATAACTGCTGAAGATATCCTTTGGGCTATGAGCAAG

CCCCTCACCGTGTTCATTAACCGGTACCGTGAGATAGAGACCGATCGTGGTTCTGCACTT 

AGAGGTGAGCCACCGTCGTTGAGACAAACCTATGGAGGAAATGGTATTGGGTTTCACGGC 

CCATCTCATGGCCTACCTCCTCCGGGTCCTTATGGTTATGGTATGTTGGACCAATCCATG 

GTTATGGGAGGTGGTCGGTACTACCAAAACGGGTCGTCGGGTCAAGATGAATCCAGTGTT 

GGTGGTGGCTCTTCGTCTTCCATTAACGGAATGCCGGCTTTTGACCATTATGGTCAGTAT 

AAGTGAAGAAGGAGTTATTCTTCATTTTTATATCTATTCAAAACATGTGTTTCGATAGAT 

ATTTTATTTTTATGTCTTATCAATAACATTTCTATATAATGTTGCTTCTTTAAGGAAAAG 

TGTTGTATGTCAATACTTTATGAGAAACTGATTTATATATGCAAAT 

>G620 Amino Acid Sequence (domain in AA coordinates: 20-118) 

MTSSVIVAGAGDKNNGIVVQQQPPCVAREQDQYMPIAN^ 

IQEOTSEYISFVTGEANERCQREQRKTITAEDILWAMSm 

TDRGSALRGEPPSLRQTYGGNGIGFHGPSHGLPPPGPYGYGMLDQSMVMGGGRYYQNGSS 

GQDESSVGGGSSSSINGMPAFDHYGQYK* 

>G625 (151..1137)

AATCGACC^TTCACAACGATGACATTCAAACACTCTTCAG 

GTCCTCTCCACTATTTTTCTCAATTTCTTTAATCTCTCTCTTTCTCTCTCTACTTCCTCT 
TCCTCTTCTTCTTCTTCTTCTTCTTCATCTATGGACCCTTTAGCTTCCCAACATCAACAC 
AACCATCTGGAAGATAATAACCAAACCCTAACCCATAATAATCCTCAATCCGATTCCACC 
ACCGACTCATCAACTTCCTCCGCTCAACGCAAACGCAAAGGCAAAGGTGGTCCGGACAAC 
TCCAAGTTCCGTTACCGTGGCGTTCGACAAAGAAGCTGGGGCAAATGGGTCGCCGAGATC 
CGAGAGCCACGTAAGCGCACTCGCAAGTGGCTTGGTACTTTCGCAACCGCCGAAGACGCC 
GCACGTGCCTACGACCGGGCTGCCGTTTACCTATACGGGTCACGTGCTCAGCTCAACTTA 
ACCCCTTCGTCTCCTTCCTCCGTCTCTTCCTCTTCCTCCTCCGTCTCCGCCGCTTCTTCT 
CCTTCCACCTCCTCTTCCTCCACTCAAACCCTAAGACCTCTCCTCCCTCGCCCCGCCGCC 
GCCACCGTAGGAGGAGGAGCCAACTTTGGTCCGTACGGTATCCCTTTTAACAACAACATC 
TTCCTTAATGGTGGGACCTCTATGTTATGCCCTAGTTATGGTTTTTTCCCTCAACAACAA 
CAACAACAAAATCAGATGGTCCAGATGGGACAATTCCAACACCAACAGTATCAGAATCTT 
CATTCTAATACTAACAATAACAAGATTTCTGACATCGAGCTCACTGATGTTCCGGTAACT 
AATTCGACTTCGTTTCATCATGAGGTGGCGTTAGGGCAGGAACAAGGAGGAAGTGGGTGT 
AATAATAATAGTTCGATGGAGGATTTGAACTCTCTAGCTGGTTCGGTGGGTTCGAGTCTA 
TCAATAACTCATCCACCGCCGTTGGTTGATCCGGTATGTTCTATGGGTCTGGATCCGGGT 
TATATGGTTGGAGATGGATCTTCGACCATTTGGCCTTTTGGAGGAGAAGAAGAATATAGT 
CATAATTGGGGGAGTATTTGGGATTTTATTGATCCCATCTTGGGGGAATTCTATTAATTT 
GTTTTTGTGGAAGATCATATTATATACGATGAGCATCCCTAAGGTCGGTCAAGAGCATTG 
GAGATTCATTGTTGAGAGGAATCAAAGAGATTGCATTCTATGAGGAGCTCTGCATGCAAA 
ATTTTGGAGGATTTTTTTACTACCTATAGAGATAAATAAGAGGGTATTTTTATTATTTTT 
TTGAAGATTTTTATTTTCAAGGAATTCGTAAAA 

TATGTGGAAGAGAATCGGAGGAGATGGTGGAAAGTTGTATGGGAATTTTATTGGTTCAAC 
ACTTCCTTCACAGTGTGCCTACCTTAATATATAATTATTGATAGGATATGATAATTTCTG

>G625 Amino Acid Sequence (conserved domain in AA coordinates: 52-119)

MDPLASQHQHNHLEDNNQTLTHNNPQSDSTTDSSTSSAQRKRKGKGGPDNSKFRYRGVRQ
RSWGKWVAEIREPRKRTRKWLGTFATAEDAARAYDRAAVYLYGSRAQLNLTPSSPSSVSS
SSSSVSAASSPSTSSSSTQTLRPLLPRPAAATVGGGANFGPYGIPFNNNIFLNGGTSMLC
PSYGFFPQQQQQQNQMVQMGQFQHQQYQNLHSNTNNNKISDIELTDVPVTNSTSFHHEVA
LGQEQGGSGCNNNSSMEDLNSLAGSVGSSLSITHPPPLVDPVCSMGLDPGYMVGDGSSTI
WPFGGEEEYSHNWGSIWDFIDPILGEFY*

>G658 (17.. 757) 

CCACGCGTCCGCTCACATGAACAAAGGAGCTTGGACTAAAGAAGAAGATCAGCTTCTTGT 

TGATTACATCCGTAAACACGGTGAAGGTTGCTGGCGATCTCTCCCTCGCGCCGCTGGATT 

ACAAAGATGTGGTAAGAGTTGTAGATTGAGATGGATGAATTATCTAAGACCAGATCTCAA 

AAGAGGCAATTTTACTGAAGAAGAAGATGAACTCATCATCAAGCTCCATAGCTTGCTCGG 

TAACAAATGGTCTTTAATAGCTGGGAGATTACCAGGAAGAACAGATAACGAGATCAAGAA 

CTATTGGAACACTCATATCAAGAGGAAGCTTCTGAGCCGTGGGATTGATCCAAACTCTCA 

CCGTCTGATCAACGAATCCGTCGTGTCTCCGTCGTCTCTTCAAAACGATGTCGTTGAGAC 

TATACATCTTGATTTCTCTGGACCGGTTAAACCGGAACCGGTGCGTGAAGAGATTGGTAT 

GGTTAATAATTGTGAGAGTAGTGGAACGACGTCGGAGAAGGATTATGGGAACGAGGAAGA 

TTGGGTGTTGAATTTGGAACTCTCTGTTGGACCGAGTTATCGGTACGAGTCGACTCGGAA 

AGTGAGTGTTGTTGACTCGGCTGAGTCGACTCGACGGTGGGGTTCCGAGTTGTTTGGAGC 

TCATGAGAGTGATGCGGTGTGTTTGTGTTGTCGGATTGGGTTGTTTCGTAATGAGTCGTG 

TCGGAATTGTCGGGTTTCTGATGTTAGAACTCATTAGAGAGTCAATCGAGAATTCTTTAG 

GAATCTTTTTATATATTTAGATCGTCAATTGTGTTTTTTTTTTGTTC^ 

AACATCAAGTAAGAAACTAGCATAATTATTTGATGGCAAAGCCAAAAGATTGTGCTC 

>G658 Amino Acid Sequence (domain in AA coordinates: 2-105) 

MNKGAWTKEEDQLLVDYIRKHGEGCWRSLPRAAGL^ 

EEEDELIIKLHSLLGNKWSLIAGRLPGRTDire^ 

SWSPSSLQOTVV^TIHLDFSGPVTCPEPVREEIGMVNNCESSGTTSE 

ELSVGPSYRYESTRKVSVVDSAESTRRWGSELFGAHESDAVCLCCRIGLFRNESCRNCRV 
SDVRTH* 

>G716 (271.. 2079) 

aaaaaaaaaggggagagatttagttttatccnncagngcctgaantacgttctgcaatca 
anacggacataaccgnccgttgtgtcctgtttataaagttttgctttttttattttctcc 
antgatgggtcttttctttcttctctctc 

TTTACCGCGTGAAGGTTTTTTTTTCTTTCTATTTTCTTTCATTTCCTCTCCTTCTACTTC 
TTCTTCTCCAGTTCTCATCTGGGTTCTTCAATGGCGAGTGTTGAAGGTGATGATGATTTC 
GGAAGTTCTTCGTCAAGGTCTTATCAAGATCAACTATAGACAGAGCTATGGAAAGTTTGT 
GCAGGTCCATTAGTGGAAGTTCCTCGTGCTCAAGAGAGAGTTTTCTACTTCCCTCAGGGT 
CAC^TGGAACAACTTGTGGCGTCAACTAATC^AGGAATC^TTCAGAAGAAATACCTGTT 
TTTGATCTTCCTCCAAAGATACTTTGTCGAGTTCTTGATGTCACTTTAAAGGCGGAGCAT 

gaaacagatgaggtttacgctcagatcacattacaaccagaggaagatcaaagtgaacca 

acaagtcitgatcc^cctattgttggaccaactaagcaagagtttcatt 

attttaacggcttcagatacaagcactcatggtggattctctgttcttcgtaaacacgcc 

actgaatgcttgccttctttggatatgacac^gctactcctactcaagaacttgtgact 

agagatcttcatggctttgaatggaggtttaagcatatattcagaggacaaccacggagg 

catttgcttactacgggttggagtacatttgtatcctcgaaaagacttgtagctggagat 

gcttttgtgttcttgaggggtgagaatggggatttacgggttggagtgagacgattagct 

cggcatcaaagcacaatgcctacttcggttatttcaagtcagagcatggatttgggagtt 

cttgctacagcttctcatgctgtgcgtacaacaag^tctttgttgtcttttacaagcct 

aggataagccaattcatagttggggtgaacaagtatatggaagctataaagcatggattt 

tctctcggtacccgattcagaatgaggtttgaaggagaagagtctcctgagagaatattt 

actggtacgattgtgggaagtggagatctatcttcacaatggccagcttctaaatggagg 

tcattgcaggtag^tgggatgagccaagz^<^^ 

tgggagatagagcctttcttggcaacttccccaattt 

tcgaaatgcaagcggtcaagacccatcgagccat<^gttaaaacac<^gccccacctagt 

TTCTTGTACAGCCTCCCTCAGAGCCAAGATTCCAT^ 

GATCCATCACTTGAGAGAATTTCAGGTGGATACTCCTCAAACAACAG CTTCAAACC CGAG 
ACTCCTCCTCCTCCAACGAATTGTAGCTATAGGTTGTTTGGATTTGATCTCACAAGCAAT
TCTCCTGCTCCAATCCCTCAAGACAAGCAACCGATGGATACTTGTGGAGCTGCCAAGTGT 
CAAGAACCCATCACTCCAACCTCAATGAGTGAGCAGAAGAAGCAAC^ 

CGAACTAAAGTGCAAATGCAAGGCATTGCGGTTGGTCGTGCGGTTGATTTAACACTGTTG 
AAATCTTACGATGAACTGATTGATGAGCTTGAGGAGATGTTTGAGATTCAAGGACAGCTT 
CTTGCCCGAGACAAATGGATCGTTGTCTTCACTGATGATGAAGGAGATATGATGCTTGCT 
GGTGATGATCCGTGGAATGAGTTTTGCAAGATGGCAAAGAAGATATTTATATATTCGAGC 
GATGAGGTTAAGAAAATGACAACGAAACTGAAGATTTCTTCGTCGTTAGAGAATGAGGAA 
TATGGTAATGAATCATTCGAAAATCGTAGTAGGGGGTGAGAGTTTTAGCTGTTAATTAAG 
GTTAATTCGGCGACGTCGTTTTAGTGCGTAAGTGTCTAAAGACT J l , l u rTTTTTAGTCTGTG 
TATATAAAGTCTTGTCCTCTTTTTCATGTCAATTTTTCAAGTTGGCGATTTAATATTTCG 
GTTTTGGGACAGTGGTTGATGGGGCGGTTTTACATTTTTTATGTGTATGTACTTGTTCCA 
AAACCATTCAATTTTCAAA 

>G716 Amino Acid Sequence (domain in AA coordinates: 24-355) 

MASVEGDDDFGSSSSRSYQDQLYTELWKVCAGPLVEVPRAQERVFYFPQGHMEQLVASTN

QGINSEEIPVFDLPPKILCRVLDVTLKAEHETDEVYAQITLQPEEDQSEPTSLDPPIVGP

TKQEFHSFVKILTASDTSTHGGFSVLRKHATECLPSLDMTQATPTQELVTRDLHGFEWRF

KHIFRGQPRRHLLTTGWSTFVSSKRLVAGDAFVFLRGENGDLRVGVRRLARHQSTMPTSV

ISSQSMHLGVLATASHAVRTTTIFVVFYKPRISQFIVGVNKYMEAIKHGFSLGTRFRMRF

EGEESPERIFTGTIVGSGDLSSQWPASKWRSLQVQWDEPTTVQRPDKVSPWEIEPFLATS

PISTPAQQPQSKCKRSRPIEPSVKTPAPPSFLYSLPQSQDSINASLKLFQDPSLERISGG

YSSNNSFKPETPPPPTNCSYRLFGFDLTSNSPAPIPQDKQPMDTCGAAKCQEPITPTSMS

EQKKQQTSRSRTKVQMQGIAVGRAVDLTLLKSYDELIDELEEMFEIQGQLLARDKWIVVF

TDDEGDMMLAGDDPWNEFCKMAKKIFIYSSDEVKKMTTKLKISSSLENEEYGNESFENRS

RG* 

>G725 (46.. 1122) 

CCTCTTT(^GAGAGAGAAAGAGAGTCAGAGAGAGAGAGAGAGAGAATGTTCCATGCTAAG 

AAACCTTCAAGTATGAATGGTTCATATGAGAACAGAGCTATGTGCGTTCAAGGCG^ 

GGCCTTGTCCTCACCACCGACCCTAAACCGCGTTTGCGTTGGACCGTCGAACTCCACGAG 

CGTTTTGTGGACGCCGTCGCTCAGCTCGGCGGCCCCGA(^U^GC(^CCCCAAAGACGATT 

ATGAGAGTTATGGGTGTGAAGGGTCTTACTCTTTACCACCTAAAGAGCCATCTTCAGAAA 

TTC^GGCTTGGAAAGCAGCCGCACAAGGAGTACGGAGATCACTCCACAAAGGAAGGTTCA 

AGAGCTTCTGCCATGGATATTCAGCGCAACGTAGCTTCTTCTTCTGGCATGATGAGTCGC 

AACATGAATGAGATGCAAATGGAAGTGCAGAGAAGGTTGCATGAACAGCTAGAGGTGCAA 

AGACATCTGCAACTGAGGATTGAAGCACAAGGAAAGTACATGCAATCTATCTTGGAGAGA 

GCTTGCCAAACCCTAGCCGGTGAGAACATGGCAGCCGCCACCGCAGCAGCCGCCGTCGGA 

GGAGGATACAAGGGTAATCTGGGAAGTTCGAGTCTTTCAGCAGCGGTGGGCCCACCTCCT 

CATCCTCTTAGTTTCCCGCCGTTTCAAGACCTAAACATCTATGGAAACACAACCGACCAA 

GTCCTCGACCATCACAACTTCCATCATCA^ 

GCTGCAGACACCAACATTTACTTGGGGAAGAAGCGACCTAATCCTAATTTTGGTAACGAT 
GTAAGGAAAGGACTATTGATGTGGTCTGATGAAGATCACGATCTTTCCGCAAACCAATCG 
ATCGATGATGAGCATAGAATTCAGATACAGATGGCTACACATGTCTCCACGGATTTGGAT 
TCTTTGTCGGAGATCTACGAAAGGAAATCAGGTTTATCAGGTGATGAAGGGAATAATGGT 
GGGAAATTACTGGAAAGGCGATCGCCTAGGAGATCACCATTGAGTCCTATGATGAACCCT 
AATGGTGGATTAATACAAGGAAGAAACTCGCCATTTGGGTGATACAATTTATTAATTTTT 
ATCTATGAGTGATGCATGGGAATGTAAGAACGAGATATATATGTTTTGTCATTGTGAGTT 
TGACGTAGGGTTTAGAGAAAA 

>G725 Amino Acid Sequence (domain in AA coordinates: 39-87) 
MFHAKKPSSMNGSYENRAMCVQGDSGLVIiTTDPKPRL 
TPKTIMRVMGVKGLTkYHLKSHLQKFRL^ 
GMMSRNMNEMQMEVQRRLHEQLEVQRHLQI^ 

AAAVGGGYKGNLGSSSLSAAVGPPPHPLSFPPFQDLNIYGNTTDQVLDHHNFHHQNIENH
FTGKNAADTNIYLGKKRPNPNFGNDVRKGLLMWSDQDHDLSANQSIDDEHRIQIQMATHV
STDLDSLSEIYERKSGLSGDEGNNGGKLLERPSPRRSPLSPMMNPNGGLIQGRNSPFG*
>G727 (43..1977)

CTTCTTCTCCTTCTCTGATCGTTCGTTTTCTGGACGAGAGAGATGGTAAATCCGGGTCAC 
GGAAGAGGACCCGATTCGGGTACTGCTGCTGGTGGGTCAAACTCCGACCCGTTTCCTGCG 
AATCTTCGAGTTCTTGTCGTTGATGATGATCCAACTTGTCTCATGATCTTAGAGAGGATG
CTTATGACTTGTCTCTACAGAGAGCAGAGAGCGCATTGTCTCTGCTTCGGAAGAACAAAG 
AATGGTTTTGATATTGTCATTAGTGATGTTCATATGCCTGACATGGATGGTTTCAAGCTC 
CTTGAACACGTTGGTTTAGAGATGGATTTACCTGTTATCAATCTGAATGTTTTGAAACCT 
TTGGTTATAGTGATGTCTGCGGATGATTCGAAGAGCGTTGTGTTGAAAGGAGTGACTCAC 
GGTGCAGTTGATTACCTCATCAAACCGGTACGTATTGAGGCTTTGAAGAATATATGGCAA 
CATGTGGTGCGGAAGAAG CGT AACGAGTGGAATGTTTCTGAACATTCTGGAG GAAGTATT 
GAAGATACTGGCGGTGACAGGGACAGGCAGCAGCAGCATAGGGAGGATGCTGATAACAAC 
TCGTCTTCAGTTAATGAAGGGAACGGGAGGAGCTCGAGGAAGCGGAAGGAAGAGGAAGTA 
GATGATO^GGGGATGATAAGGAAGACTC^TCGAGTTTAAAGAAACCACGCGTGGTTTGG 
TCTGTTGAATTGCATCAGCAGTTTGTTGCTGCTC 

TTAAAAACTTGCTTGCTTATGCATTTGTGTGTGTCGATTGGTAACATTGTGGAATTCCA^ 

AAGTATCGGATATATCTGAGACGGCTTGGAGGAGTATCGCAACACCAAGGAAATATGAAC 

CATTCGTTTATGACTGGTCAAGATCAGAGTTTTGGACCTCTTTCTTCGTTGAATGGATTT 

GATCTTCAATCTTTAGCTGTTACTGGTCAGCTCCCTCCTCAGAGCCTTGCACAGC 

GCAGCTGGTCTTGGCCX3GCCTACACTCGCTAAACCAGGGATGTCGGTTTCTCCCCTTGTA 

GATCAGAGAAGCATCTTCAACTTTGAAAACCCAAAAATAAGATTTGGAGACGGACATGGT 

CAGACGATGAACAATGGAAATTTGCTTCATGGTGTCCCAACGGGTAGTCACATGCGTCTG 

CGTCCTGGACAGAATGTTCAGAGCAGCGGAATGATGTTGCCAGTAGCAGAC<^GCTACCT 

CGAGGAGGACCATCGATGCTACCATCCCTCGGGCAACAGCCGATATTGTCAAGCAGCGTT 

TCAAGAAGAAGCGATCTC^CTGGTGCGCTGGCGGTTAGAAACAGTATCCCCGAGACCAAC 

AGCAGAGTGTTACCAACTACTCACTCGGTCTTCAATAACTTCCCCGCGGATCTACCTCGC 

AGCAGCTTCCCGTTGGCAAGTGCCCCAGGGATTTCAGTTCCAGTATCAGTTTCTTACCAA

GAAGAGGTCAACAGCTCGGATGCAAAAGGAGGTTCATCAGCTGCTACTGCTGGATTTGGT 

AACCCAAGCTACGACATATTTAACGATTTTCCGCAGCACCAACAGCACAACAAGAACATC 

AGCAATAAACTAAACGATTGGGATCTGCGGAATATGGGATTGGTCTTCAGTTCCAATCAG 

GACGCAGCAACTGCAACCGCAACCGCAGCATTTTCCACTTCGGAAGCA 

TCTACG(^GAGAAAAAGACGGGAAACGGACGCAACAGTTGTGGGTGAGCATGGGCAGAAC 

CTGC^GTCACCGAGCCGGAATCTGTATCATCTGAACCACGTTTTTATGGACGGTGGOT 

GTCAGAGTGAAGTCAGAAAGAGTGGCGGAGACAGTGACTTGTCCTCCAGCAAATACATTG 

TTTt^CGAGC^GTATAATCAAGAAGATCTGATGAGCGCATTTCTCAAACAGGTTTGATTA 

TTACTCGAATACAGTGCACTCTAAAAC 

>G727 Amino Acid Sequence (domain in AA coordinates: 226-269) 

MVNPGHGRGPDSGTAAGGSNSDPFPANLRVTjVVDDDPTC^ 

CFGRTKNGFDIVISDVHMPDMDGFKLLEHVGLE^^ 

LKGVTHGAVDYLIKPVRIEALKNIWQHVVRKKRNEWNVSEHSGGSIEDTGGDRDRQQQHR
EDADNNSSSVNEGNGRSSRKRKEEEVDDQGDDKEDSSSLKKPRVVWSVELHQQFVAAVNQ
LGVDSELKTCLLMHLCVS IGNI 

SSLNGFDLQSLAVTGQLPPQSLAQLQAAGLGRPTLAKPGMSVSPLVDQRSIFNFENPKIR
FGDGHGQTMNNGNIiLHGVPTGSH^^ 

ILSSSVSRRSDLTGALAVRNSIPETNSRVLPTTHSVFNNFPADLPRSSFPLASAPGISVP

vsvsyqeewssdakggssaatagfgitosydifndfpqhqqhn™ 

VFSSNQDAATATATAAFSTSEAY^SSSTQRKRRETDAT\A^GEHGQNIjQSPSR1^YHIjNHV 

fmdggsvrvkservaetvtcppantlfheqynqedlmsaflkqv* 

>G740 (25.. 924) 

CTTCTTC^CTTTTTTTTTTAACGATGGCTTCAGAGGATCAATCGGCGGCGAGATCTACC 
GGGAAGGTGAACTGGTTCAACGCTTCTAAAGGCTATGGTTTCATTACTCCTGACGATGGC 
AGCGTAGAGCTTTTCGTTCATCAATCTTCAATTGTCTCCGAAGGTTACCGGAGTTTAACC
GTCGGCGATGCGGTTGAGTTCGCTATTACTCAGGGAAGCGACGGTAAGACTAAAGCCGTC 
AATGTTACTGCTCCTGGTGGTGGTTCTCTCAAGAAGGAGAATAACTCTCGTGGTAACGGT 
GCTAGGCGCGGCGGCGGTGGAAGCGGTTGCTACAATTGCGGTGAGTTAGGTCATATCTCT 
AT^AGATTGTGGTATTGGTGGCGGCGGCGGAGGTGGTGAACGTAGATCTAGAGGAGGAGAA 
GGTTGTTACAATTGTGGTGATACTGGTCACTTCGCTAGGGATTGTACTTCAGCTGGAAAC 
GGTGACCAACGTGGAGCCACCAAAGGTGGAAACGATGGTTGCTACACTTGCGGTGATGTT 
GGTCACGTGGCTAGGGATTGTACTCAGAAATCAGTTGGAAACGGAGACCAACX5TGGAGCG 
GTCAAAGGTGGAAACGATGGTTGCTACACTTGTGGTGATGTTGGTCACTTTGCTAGGGAT 
TGTACTCAGAAGGTTGCTGCCGGAAACGTCAGAAGCGGTGGTGGTGGTAGTGGAACTTGT 
TATTCATGCGGTGGAGTTGGTCACATTG(^^GAGATTGTGCGACTAAGAGACAGCCTTCT

CGTGGGTGTTACCAGTGTGGTGGTTCTGGTCACTTGGCTCGTGATTGTGACCAGAGAGGA 

AGCGGTGGAGGAGGTAATGATAATGCGTGCTACAAGTGTGGTAAGGAAGGTCACTTTGCA 

AGGGAATGTTCTTCTGTAGCTTAATCGATTTCCTAATCAACAAAACAAAAA 

GAAATTGAATCGAGTTATATAGTTTGGTATATATTACTCOT 

TTTTGTTGTTGATGGGAATGAAATTGCCTGGTCCTTTTGGTGTGTT^ 

ATTATAC^GAGTGATCCCTTTTTTGTTATAACTATTACAAGTTTT 

TGGATGCTCTCTCCTTTTCTTCTATCTGTTTCTGGAAATTTTGACCTCATCATATTACTT 
ATGTCATCCAAA 

>G740 Amino Acid Sequence (domain in AA coordinates: 24-42, 232-268) 

MASEDQSAARSTGKVNWFNASKGYGFITPDDGSVELFVHQSSIVSEGYRSLTVGDAVEFA

ITQGSDGKTKAVNVTAPGGGSLKKENNSRGNGARRGGGGSGCYNCGELGHISKDCGIGGG

GGGGERRSRGGEGCYNCGDTGHFARDCTSAGNGDQRGATKGGNDGCYTCGDVGHVARDCT

QKSVGNGDQRGAVKGGNDGCYTCGDVGHFARDCTQKVAAGNVRSGGGGSGTCYSCGGVGH

IARDCATKRQPSRGCYQCGGSGHLARDCDQRGSGGGGNDNACYKCGKEGHFARECSSVA*

>G770 (119.. 1069) 

CCTTCCTCTATATAAGGAAGTTCATTTCATTTGGAGAGGACACGCTGACAA.GCTGACTCT 

AGCAGATCTGGTACCGTCGACGGTTCTTGGATTTGGAGTAAACTAAAGATCA 

GGAACAAGGAGATCATCAGCAGCATAAGAAAGAAGAAGAAGCTTTGCC^ 

ATTTCATCCGACX^ATGAGGAGCTAATCTCATATTACTTGGTTAATAAGATTGCCGATCA 

AAACTTCACCGGGAAAGCAATCGCTGACGTTGATCTTAACAAGTCCGAGCCATGGGAGCT 

TCCTGAGAAGGCGAAAATGGGAGGAAAAGAATGGTACTTTTTTAGCCTCCGGGACCGGAA 

GTACCCGACGGGAGTGAGGACGAATAGGGCGACGAATACAGGATATTGGAAAACCACAGG 

AAAAGACAAAGAGATATTGAB/TAGGACAACCTCG 

GGTCTTTTACAGAGGACGAGCTCC^CGTGGGGAGAAGACTTGTTGGGTC^TGCATGAGTA 

TCGACTTCACTCCAAGTCCTCATATAGAACCTCCAAGCAAGACGAGTGGGTAGTGTGTAG 

AGTGTTCAAGAAAAGAGAAGCAACCAAGAAATAGAT^ 

TCATCACCACAACAACGACACAAGAGCCTC^^ 

TTACTCATCAGACCTCCTTC^CTCCC^ 

TAACCAATCCCTCATGGCAAACGCCGTTCACCTAGCTGAGCTCTCAAGAGTCTTCCGTGC 
CTCTACAAGCACCACCATGGACTCTTCTC^TCAGGAGCTAATGAACTACACCCACATGCC 

TCTTGAGGATGTTGCCGCGGTTAGTGCTTCGTACAATGGCGAAAACGGGTTTGGAAATGT 
GGAGATGAGCCAGTGCATGGACTTGGATGGATACTGGCCATCTTATTGATTGGTAATTGT 
CAGTTTAAGTTATGGTTTTTATATTGTTTCCATTTACTTGTTGGTAAAACGATTTTGGTT 
GTTCTTGCGAACGCTCTAGACAGGCCTCGTACCGGATCCTCTAGCTAGAGCTTTCGTTCG 
TATCATCGGTTTC 

>G770 Amino Acid Sequence (domain in AA coordinates: 19-162) 

MEQGDHQQHKKEEEALPPGFRFHPTDEELISYYLVN^ 

IiPEKAKMGGKEWYFFSIiRDRKYPTGVRTNRATNT^ 

LVFYRGRAPRGEKTCWVMHEYRLHSKSSYRTSKQDEWVVCRVFKKTEATKKYISTSSSST
SHHHNNHTRASILSTNNNNPNYSSDLLQLPPHLQPHPSLNINQSLMANAVHLAELSRVFR
ASTSTT^SSHQQLMNYTHMPVSGIjNI^ 
VEMSQCMDLDGYWPSY*
>G858 (99.. 869) 

CATAATCTCTTCTCTCTATATCTCTTCTCTTCTTCTTTTACCCTGTTTTTTTTTTCATTC 
CACAGAGCCCAGGTTGATTGATTTTGTTATTCAGAGATATGGGGAGAGGAAGGATTGAGA 
TTAAGAAGATTGAGAATATCAACAGTCGTC^GTCACTTTCTCTAAGAGACGAAACGGTT 
TGATCAAGAAGGCTAAAGAGCTTTCGATTCTCTGTGACGCCGAGGTTGCTCTTATCATCT 
TCTCCAGCACCGGGAAGATTTACGATTTCTCCAGCGTCTGTATGGAGC?UU^TTCTTTCTA 
GATATGGATACACTACTGCGTCCACTGAGCATAAACAACAAAGAGAACACCAACTTCTAA 
TTTGTGCTTCACATGGAAATGAAGCTGTGTTGCGAAATGATGATTCTATGAAGGGGGAAC 
TTGAAAGATTACAGCTTGCAATTGAGAGACTTAAGGGTAAGGAGCTTGAAGGTATGAGTT 
TCCCGGATCTTATTTCTOTTGAAAACCAGTTGAACGAGAGCTTGCATAGTGTCAAGGATC 
AAAAGACACAAATCCTGCTCAACCAGATTGAGAGATCCAGGATACAGGAGAAAAAAGCAT 
TGGAAGAAAACCAAATCTTGCGCAAACAGGTTGAGATGTTGGGGAGAGGTTCAGGACCAA 
AAGTGTTGAATGAAAGGCCTCAAGATTCTAGCCCAGAAGCCGATCCCGAGAGCTCTTCAT 






CAGAAGAGGATGAGAATGACAACGAGGAGCACCATTCCGACACTTCCTTGCAGTTGGGGT 

TGTCGTCGACGGGGTATTGCACAAAGAGAAAGAAGCCGAAGATCGAACTGGTCTGCGATA 

ACTCTGGGAGTCAAGTGGCTTCTGATTGATGGAATCGATTATTTTTCTAATTCTGGTTGT 

TTAGGGGTCTCTATGTGTCTTCTTGTTTCT 

TAGAGTTTTCTTAATGTTTAGGTGGAAC^TTTTT^ 

TCAATAACATTAGATTTTCTTAGTTAAAGACTT7^AAGTTGCCCACACACCACACCATATG 

TGATTATGATGAATTTAC^TTTTATAAAAAAAAAAAAAAAAAAAAAAAAA 

>G858 Amino Acid Sequence (domain in AA coordinates: 2-57) 

MGRGRIEIKKIENINSRQVTFSKRRNGLIKKAKELSILCDAEVALIIFSSTGKIYDFSSV

CMEQILSRYGYTTASTEHKQQREHQLLICASHGNEAVLRNDDSMKGELERLQLAIERLKG

KELEGMSFPDLISLENQLNESLHSVKDQKTQILLNQIERSRIQEKKALEENQILRKQVEM

LGRGSGPKVLNERPQDSSPEADPESSSSEEDENDNEEHHSDTSLQLGLSSTGYCTKRKKP

KIELVCDNSGSQVASD*

>G865 (282..920)

ATCCCGAXZTTGTTGTTCATCACCAAGCCAAGC 

CTATCA.TCATCAATTCGTTTCAAACTTAGTTCCTTTCAAAGTCTTGTACATATATACACA 
CACACCrATTATTCTCTTGGTGTGTTTGTGTGTTA(^TATACGTGTGAGTACATACTTTG 
TTGTAAAAGTGGATCGGAGGTATGGAAAGGGACCGGTTCCACCGGAAACATCGGCGGCGG 
CGGATGATAATTCGTCTTGGAACGAGACTGATGTC^CCXSCC^TGGTCTCCGCTCTC^GCC 
GTGTCATAGAGAATCCGACAGACCCGCCGGTCAAACAAGAGCTTGATAAATCGGATCAA.C 
ATCAACCAGACCAAGATCAACCZ^GAAGAAGACACTATAGA 

GGGGTAAATGGGCGGCAGAAATCCGCGATCCAAAGAAAGCAGCCCGTGTCTGGCTCGGGA 

CTTTCGAGACGGCAGAGGAAGCTGCTTTAGCCTATGACCGAGCTGCCCTCAAATTCAAAG 

GCACCAAGGCTAAACTGAACTTCCCTGAACGGGTCCAAGGCCCTACTACCACC^ 

TTTCTCATGCACCAAGAGGAGTTAGTGAATCCATGAACTCACCTCCTCCTCGACCTGGTC 

GACCTTCAACTACTACTACTTCGTGGCCAATGACTT^ 

CTCAGTTGCTTACGAGTAACAATGAGGTTGATTTATCATACTACACGTCGACTCTCTTCA 

GTCAACCTTTTTCAACGCCTTCTTGATCT^ 

AGCTAGAACAACAACAACAGCAGCGTGAAGAAGAAGAGA 

ATAACTACCCAAGAGAATAATCTAATTATTATTGTTGGTCGAATCAGTTTTATAAATAGC 

TATCATAGTTTCATTTTTGGTTTCCGTAACCTTTGTTGCATGGAAAATATGAATGAACGA 

GGGACATGTGTAACAATTTGTTTGTGTTTCGTAAATGTTAGTTGTATTTGGATTTGCTGA 

AGTTTGATTTTCTGAGCATAAATCATTTGACGGTCAAAAAAAAAAA 

>G865 Amino Acid Sequence (domain in AA coordinates: 36-103) 

MVSALSRVIENPTDPPVKQELDKSDQHQPDQDQPRRRHYRGVRQRPWGKWAAEIRDPKKA 

ARVWLGTFETAEEAALAYDRAALKJFKGTKAKI 

PPPRPGPPSTTTTSWPMTYNQDILQYAQLLTSNNEVDLSYYTSTLFSQPFSTPSSSSSSS 
QQTQQQQLQQQQQQREEEEKNYGYNYYNYPRE * 
>G872 (59.. 646) 

CCGGAAAGAGAATCCAATTCAACCAAACCGAATCGAACCGAACCGGAGTTTTTATCCAAT 

GGTGAAGCAAGCGATGAAGGAAGAGGAGAAGAAGAGAAACACGGCGATGCAGTCAAAGTA 

CAAAGGAGTGAGGAAGAGGAAATGGGGAAAATGGGTATCGGAGATCAGACTTCCACACAG 

CAGAGAACGAATTTGGTTAGGCTCTTACGACACTCCCGAGAAGGCGGCGCGTGCTTTCGA 

CGCCGCTC^TTTTGTCTCCGCGGCGGCGATGCTAATTTCAATTTCCCTAATAATCCACC 

GTCGATCTCCGTAGAAAAGTCGTTGACGCCTCCGGAGATTCAGGAAGCTGCTGCTAGATT 

CGCTAACACATTCCAAGACATTGTCAAGGGAGAAGAAGAATCGGGTTTAGTACCCGGATC 

CGAGATCCGACCAGAGTCTCCTTCTACATCTGCATCTGTTGCTACATCGACGGTGGATTA 

TGATTTTTCGTTTTTGGATTTGCTTCCGATGAATTTCGGGTTTGATTCCTTCTCCGACGA 

CTTCTCTGGCTTCTCCGGTGGTGATCGATTTACAGAGATTTTACCCATCGAAGATTACGG 

AGGAGAGAGTTTATTAGATGAATCTTTGATTCTTTGGGATTTTTGAATTCCCAAACATAA 

TATTTTTTTAGAGCGAACTGTGAGATTTTCCTTGGAGTCATGGAGAAATCTGGAGATTTT 

TTGTAACACGGAGCTCCAATGACCCGGGAATTTCTTTCGTTTCGGATCCGAATTTGATGT 

GGATCATATTCACACCTATATTTTTTCATTTTTTTGTTGTAAAGAAAAATCGGATAAGAT 

TCTAGTAATAAATGTTAAAAGTC CATTTCATTAAAAAAAAAAAAAAAAAAA 

>G872 Amino Acid Sequence (domain in AA coordinates: 18-85) 

MVKQAMKEEEKKRNTAMQSKYKGVRKRKWGKWVSEIRLPHSRERIWLGSYDTPEKAARAF

DAAQFCLRGGDANFNFPNNPPSISVEKSLTPPEIQEAAARFANTFQDIVKGEEESGLVPG
SEIRPESPSTSASVATSTVDYDFSFLDLLPMNFGFDSFSDDFSGFSGGDRFTEILPIEDY

GGESLLDESLILWDF* 

>G904 (1. .1005) 

atggaatctctcatcaatcccagccatggcggaggaaactacgattctcactcttcttct 
ctcgatagtctcaaaccaagcgtactagtcatcattctcattctcctcatgactcttctc 
atctccgtttccatttgcttcctcctccgctgtctcaatcgctgtagccaccgctccgtt 
ctccctctttcatcttcctcttccgtcgcaaccgtaacttccgattcccgacgattctct 
ggacatcgagtctctcccgaaacagaacggtcctccgtgcttgattcgcttccgattttc 
aaattctcctccgtcactcgccgatctagctccatgaattccggagattgcgccgtttgt 
ttgtcgaaattcgaaccggaggatcagctccgtcttcttcctctctgttgtcacgctttt 
cacgccgattgtatcgatatctggctagtctctaaccagacttgtcctctctgtcgctct 
cctctcttcgcttcagaatctgatctcatgaagtctctcgccgtcgtcggctcaaacaac 
ggcggaggagaaaacagcttccgtctcgaaatcggatccatcagccgtcgtcgtcaaaca 
ccgattccagaatccgttgagcagcatcgaacttactcaatcggttcgttcgattacata 
gtagacgacgtagattcagaaatctcagagtcaaatttcaaccgtggaaaacaggaagac 
gcgactacaacaactgccacagcaacggcggttacgactaatccgacgtcgtttgaagct 
agtttagcggcggatataggtaacgatggttctagaagctggctcaaggattacgttgac 
agactctcacgaggtatatcgtcgcgtgcaatgtcgtttagaagctctggtagatttttt 
actgggagtagtcgtcggagtgaggaattgacggtgatggatttagaagcgaatcatgcc 
ggagaagagataagtgagcttttccggtggctctcaggggtgtga 

>G904 Amino Acid Sequence (domain in AA coordinates: 117-158) 

MESLINPSHGGGNYDSHSSSLDSLKPSVLVIILILLMTLLISVSICFLLRCLNRCSHRSV

LPLSSSSSVATVTSDSRRFSGHRVSPETERSSVLDSLPIFKFSSVTRRSSSMNSGDCAVC

LSKFEPEDQLRLLPLCCHAFHADCIDIWLVSNQTCPLCRSPLFASESDLMKSLAVVGSNN

GGGENSFRLEIGSISRRRQTPIPESVEQHRTYSIGSFDYIVDDVDSEISESNFNRGKQED

ATTTTATATAVTTNPTSFEASLAADIGNDGSRSWLKDYVDRLSRGISSRAMSFRSSGRFF

TGSSRRSEELTVMDLEANHAGEEISELFRWLSGV*

>G910 (1..1071) 

ATGTTATGTATAATAATAATTGAGAATATGGAAAGAGTATGTGAGTTTTGTAAAGCGTAT 
AGAGCAGTGGTTTATTGTATAGCTGATACAGCAAATCTTTGTTTAACATGTGATGCAAAG 
GTTCATTCAGCTAATTCACTCTCGGGACGGCATTTACGTACGGTTTTATGTGATTCTGGT 
AAGAATCAGCCTTGTGTTGTCCGATGTTTTGACCATAAAATGTTTCTTTGCCATGGATGT
AATGATAAGTTTCATGGTGGTGGCTCTTCTGAGCATCGTAGAAGGGATTTGAGGTGTTAT 
ACGGGTTGTCCTCCTGCTAAAGATTTCGCGGTTATGTGGGGTTTTCGAGTTATGGATGAC 
GATGATGATGTTTCGTTAGAGCAATCTTTTCGAATGGTTAAACCTAAGGTGCAAAGAGAA 
GGTGGTTTTATCTTGGAACAGATTCTTGAATTGGAGAAGGTTCAGCTCAGGGAAGAGAAT 
GGTAGTTCTTCCTTGACAGAACGAGGTGATCCATCTCCATTGGAGCTTCCTAAGAAACCC 
GAAGAACAGTTAATCGATCTTCCGCAGACCGGAAAAGAGCTGGTTGTTGATTTTTCACAC 
TTGTCCTCATCTTCCACACTTGGTGATTCCTTTTGGGAATGCAAAAGTCCATACAATAAG 
AACAATCAGTTGTGGCATCAAAATATACAAGACATTGGAGTATGTGAAGATACAATCTGC
AGTGACGATGACTTCCAAATACCTGACATTGATCTCACTTTCCGGAACTTTGAAGAGCAA 
TTTGGAGCTGATCCTGAGCCAATTGCAGATAGTAACAACGTGTTCTTTGTTTCTTCCCTT 
GACAAATCAGATGAGATGAAGACATTTTCTTCTTG^^ 

AAACCAGCTTCATCAACTATCTCATTCTCAAGCAGTGAAACCGATAACCCTTATAGTCAC 

TCAGAGGAAGTAATCTCATTTTGTCCCTCCCTCTCTAACAATACACGTCAAAAGGTCATC 

ACAAGGCTCAAGGAGAAGAAGAGAGCAAGAGTGGAGGAGAAAAAAGCTTAA 

>G910 Amino Acid Sequence (domain in AA coordinates: 14-37, 77-103)

MLCIIIIENMERVCEFCKAYRAVVYCIADTANLCLTCDAKVHSANSLSGRHLRTVLCDSG

KNQPCVVRCFDHKMFLCHGCNDKFHGGGSSEHRRRDLRCYTGCPPAKDFAVMWGFRVMDD

DDDVSLEQSFRMVKPKVQREGGFILEQILELEKVQLREENGSSSLTERGDPSPLELPKKP
EEQLIDLPQTGKELVVDFSHLSSSSTLGDSFWECKSPYNKNNQLWHQNIQDIGVCEDTIC
SDDDFQIPDIDLTFRNFEEQFGADPEPIADSNNVFFVSSLDKSHEMKTFSSSFNNPIFAP
KPASSTISFSSSETDNPYSHSEEVISFCPSLSNNTRQKVITRLKEKKRARVEEKKA*
>G912 (20.. 694) 

CATCTTATCCAAAGAAAAAATGAATCC^TTTTACTCTACATTCCCAGACT 

AATCTCCGATCATAGATCTCCGGTTTCAGACAGTAGTGAGTGTTCACCAAAGTTAG 

AAGTTGTCCAAAGAAACGAGCTGGGAGGAAGAAGTTTCGTGAGACACGTCATCCGATTTA 
CAGAGGAGTTCGTCAGAGGAATTCTGGTAAATGGGTTTGTGAAGTTAGAGAGCCTAATAA

GAAATCTAGGATTTGGTTAGGTACTTTTCCGACGGTTGAAATGGCTGCTCGTGCTCATGA 

TGTTGCTGCTTTAGCTCTTCGTGGTCGCTCTGCTTGTCTCAATTTCGCTGATTCTGCTTG 

GCGGCTTCGTATTCCTGAGACTACTTGTCCTAAGGAGATTCAGAAAGCTGCGTCTGAAGC 

TGCAATGGCGTTTCAGAATGAGACTACGACGGAGGGATCTAAAACTGCGGCGGAGGCAGA 

GGAGGCGG CAGGGGAGG GGGTGAGGGAGGGGGAGAGGAGGGCGG AGGAGCAGAATGGTGG 

TGTGTTTTATATOGATGATGAGGCGCTTTTGGGGATGCCCAACTTTTTTGAGAAT 

GGAGGGGATGCTTTTGCCGCCGCCGGAAGTTGGCTGGAATCATAACGACTTTGACGGAGT 

GGGTGACGTGTCACTCTGGAGTTTTGACGAGTAATTTTTTGGCTCiU"*r'i*'rCTGGATAATA 

AGTT 

>G912 Amino Acid Sequence (domain in AA coordinates: 51-118)

MNPFYSTFPDSFLSISDHRSPVSDSSECSPKLASSCPKKRAGRKKFRETRHPIYRGVRQR

NSGKWVCEVREPNKKSRIWLGTFPTVEMAARAHDVAALALRGRSACLNFADSAWRLRIPE

TTCPKEIQKAASEAAMAFQNETTTEGSKTAAEAEEAAGEGVREGERRAEEQNGGVFYMDD
EALLGMPNFFENMAEGMLLPPPEVGWNHNDFDGVGDVSLWSFDE*
>G920 (114.. 1154) 
AAAAAATCTATTTTCTTCTCTTTC 

ATACTAAAAACCTAAAAAAAGTTACATATTGATTGT^ 

CGAATAGTAACAACACGAAATCCATAAAGAGAAAAGTTGTCGACCAACTTGTCGAAGGCT 
ATGAATTCGCTACTC^GCTTCAGCTTCTCCT^ 

TCGATGAGACCCGTCTTGTTTCCGGGTCGGGTTCAGTTTCCGGTGGTCCAGATCCCGTTG 
ATGAGCTCATGTCTAAGATCTTGGGATCTCT^ 

TTGATCCCGTCGCCGTCTCTGTCCCCATCGCCGTCGAGGGTTCATGGAATGCTTCATGTG 

GGGATGATTCGGCGACTCCGGTGAGTTGCAACGGTGGAGATTCCGGTGAGAGTAAGAAGA 

AGAGATTAGGGGTTGGTAAGGGTAAAAGAGGATGCTACACTAGAAAGACGAGATCACATA 

CAAGGATCGTGGAAGCTAAAAGTTCTGAAGACAGATATGCTTGGAGGiU^TATGGAC^^ 

AGGAGATTCTTAATACCACATTCCCAAGAAGTTACT^ 

AAGGATGCAAAGCAACAAAGGAAGTTCAGAAACAGGATCAA^ 

TCACATACATTGGCT^ACCACACATGC^CTGCCAATGACCAAACGCACG 

CTTTTGATCAAGAAATCATTATGGATTCGGAAAAGACATTGGCTGCTAGCACTGCTCAGA 

ACCATGTCAATGCTATGGTGCAAGAGCAAGAGAAGAAGACGAGCAGTGTGACAGCAATAG 

ACGCAGGCATGGTTAAGGAGGAACAAAATAACAATGGTGATCAGAGTAAAGATTATTATG 

AGGGCTCTTCGACAGGTGAGGACTTGTCATTGGTTTGGCAAGAGACGATGATGTTTGATG 

ATCATCAAAATCACTACTATTGTGGTGAAACCAGTACTACTTCTCATCAATTTGGTTTCA 

TCGACAACGATGATCAGTTTTCCTCCTTCTTCGACTCATATTGTGCTGATTATGAAAGAA 

CAAGTGCTATGTGAACATCCAAATCTGGAATGATGAATCAGCACTAGGTCTTCTCTTTGA 

GTATGTCTAGTTTAATGTAATATTTTTGTTGTATGTTTGATAAAAACACCATATATACTT 

CTCTTTTTACACCAAAAAAAAAAAAAAAAAAAAAAA 

>G920 Amino Acid Sequence (domain in AA coordinates: 152-211) 
ITOSNSNNTKSIKRKVVDQIjVEGYEFATQIjQLIiLiSHQHSNQYHIDETRIjVSGSGSVSGGPD 
PVDELMSKILGSFHKTISVIjDSFDPVAVSVPIAVEGSWNASCGDDSATPVSCNGGDSGES 
KKKRLGVGKGKRGCYTRKTRSHTRIVEAKSSEDRYAW 

PTQGCKATKQVQKQDQDSEMFQITYIGYHTCTANDQTHAKTEPFDQEIIMDSEKTLAAST 
AQNHVNAMVQEQENNTS S VTAIDAGIWKEEQWNNGDQS KD YYEGS S TGEDLS LVWQETMM 
FDDHQNHYYCGETSTTSHQFGFIDNDDQFSSFFDSYCADYERTSAM* 
>G939 (9.. 1565) 

CAGATTCTATGGATATGTATAACAACAATATAGGGATGTTCCGGAGTTTAGTTTGTAGCT 

CGGCGCCTCCATTTACAGAGGGACATATGTGTTCTGATTCGCATACGGCTTTGTGCGATG 

ATCTGAGTAGTGATGAGGAAATGGAAATAGAGGAGCTTGAGAAGAAGATCTGGAGAGACA 

AGCAGCGTTTAAAGCGGCTCAAGGAAATGGCGAAGAACGGTCTAGGAACAAGATTGTTGT 

TGAAGCAGCAACATGATGATTTTCCAGAGCACTCTAGTAAGAGAACCATGTACAAGGCAC 

AAGATGGGATCTTGAAGTACATGTCGAAGACAATGGAGCGATATAAAGCTCAAGGTTTTG 

TTTATGGGATTGTGTTAGAGAATGGGAAAACGGTAGCGGGATCTTCTGATAATCTCCGTG 

AATGGTGGAAAGACAAAGTGAGGTTTGATAGGAACGGCCCAGCTGCTATAATCAAGCACC 

AAAGGGATATCAATCTTTCTGATGGAAGTGATTCAGGGTCTGAGGTTGGGGATTCTACCG 

CACAGAAGTTGCrTTGAGCTTCAAGATACTACTCTTGGAGCTCTGTTATCGGCTCTGTTO 

CTCACTGCAACCCTCCTGAGAGGCGGTTTCCGTTGGAGAAAGGCGTGACACCGCCATGGT 






GGCCAACGGGGAAAGAAGATTGGTGGGATCT^CTGTCTTTACCCGTTGATTTTCGAGGTG 
TTCCGCCACCTTACAAGAAGCCTCATGATCTCAAGAAGCTGTGGAAAATTGGTGTTTTGA 
TTGGTGTAAT(^GACATATGGCTTCTGAC!ATTAGCAACATACCCAATCTCGTGAGACGGT 
CTAGAAGTTTGCAGGAGAAAATGACGTCAAGAGAAGGCGCTTTATGGCTCGCTGCTCTTT 
ACCGAGAAAAGGCTATTGTTGATCAAATAGCCATGTCTAGAGAAAACAACAACACTTCTA 
ACTTTCTTGTTCCTGCAACCGGTGGAGACCCAGATGTTTTGTTTCCTGAATCTACAGACT 
ATGATGTTGAACTGATTGGTGGCZACTCATCGGACCAATCAGCAGTATCCTGAATTTGAAA 
AGAACTACAACTGTGTTTACAAGAGAAAGTTTGAAGAAGAT^ 

CAACACTCCTAACATGTGAGAACAGTCTCTGTCCTTATAGCCAACCACATATGGGATTTC 
TTGACAGGAACTTAAGAGAGAATCACCAAATGACTTGT 

ACCAACCAACTAAACCCTATGGTATGACGGGTTTAATGGTTCCTTGTCCGGATTATAACG 
GGATGCAGCAGGAGGTTCAGAGCTTTCAAG 

GACCAAAAGCTCCACAAAGAGGCAACGATGACTTGGTTGAGGATTTGAATCCTTCTCCTT 
CGACGCTGAATCAGAATCTTGGTTTAGTCTTACCTACTGACTTCAATGGAGGTGAGGAAA 
CAGTAGGAACAGAGAACAATC TG CATAATCAAGGGCAAGAGTTGCCCACATCTTGGATTC 
AGTAT^GAAAGCTTCAGAGTTTTCTTTTTATGTTTTCTAGTCTTTATAGCTTTGTCTCTT 
GCTTATTCTCTCATTAAACACAGTTTTTGATCTCTC^ 
GAGAAGATTAGGTTTCATAATAAGTTAATAACGAAATTCAAA

>G939 Amino Acid Sequence (domain in AA coordinates: 97-106)
MDMYNL^IGMFRSIiVCSSAPPFTEGHMCS 
LKRIiKEMAKNGLGTRLLLKQQHDDFPEHSSKRTMYKAQD 
IVLENGKTVAGSSDBTLREWWKDKVRFD^ 

LLELQDTTLGAIjLSALFPHCNPPQRRFPLEKGVTPPWWPTGKEDWWDQLSLPVDFRGVPP 
PYKKPHDLKKLWKIGVXiIGVIRHMASD I SNIPNIjVRRSRSLQEKMTSREGALWLAAIjYRE 
KAIVDQIAMSRENNNTSNFLVPATGGDPDVLFPESTDTO 

NCVYKRKFEEDFGMPMHPTLLTCENSLCPYSQPHMGFLDRNLRENHQMTCPYKVTSFYQP 
TKPYGMTGLIWPCPDYNGMQQQVQSFQDQFNHPNDLYRPKAPQRGNBDLVED 
NQNLGLVLPTDFNGGEETVGTENNIjHNQGQELPTSWIQ* 
>G963 (1..897) 

ATGAGTTTGCCTCCAGGATTCAGGTTTCATCCCACTGATGAAGAACTGGTGGCTTACTAT 

CTTGATAGGAAGGTCAACGGCCAAGCCATTGAGCTCGAGATGATCCCAGAAGTTGATCT^ 

TATAAATGCGAGCCATGGGACTTGCCTGAAAAGTC^TTTTTGCCGGGAAACGACATGGAA 

TGGTACTTTTACAGCACAAGGGATAAGAAGTATCCAAATGGCTCTAGGACGAACCGTGCG 

ACCCGAGCGGGTTACTGGAAGGCCACGGGGAAAGATCGTACAGTAGAATCAAAGAAGATG 

AAGATGGGAATGAAGAAGACACTGGTTTATTATAGAGGAAGGGCTCCTCATGGCCTTCGT 

ACTAATTGGGTCATGCATGAATATCGTCTCACGCACGCTCCTTCCTCCTCCTTGAAGGAG 

TCGTATGCATTGTGCCGAGTGTTTAAGAAGAACATACAAATTCCAAAGAGAAAAGGGGAA 

GAAGAAGAAGCAGAAGAAGAGAGCACTAGTGTAGGAAAAGAAGAGGAAGAAGAAAAGGAG 

AAGAAGTGGAGAAAATGTGATGGTAATTATATTGAAGACGAGAGCTTGAAAAGAGCATCC 

G CGGAGACATCTT CATCAGAG CTAACTCAAGGGGT G CTTTTAGACGAAGCAAACAGCTCA 

TCCATATTTGCTCTTCATTTCTCATCTTCTCTTCTGGACGATCATGATCATCTTTTCTCA

AACTATTCTCATCAGCTTCCATATCATCCTCCTCTTCAACTCCAAGATTTCCCTCAACTT 

TCTATGAACGAAGCAGAGATTATGTCAATCCAACAAGACTTTCAATGCAGAGACTCTATG 

AACGGGACACTTGACGAAATCTTCTCTTCTTCCGCCACTTTCCCCGCTTCCCTTTGA 

>G963 Amino Acid Sequence (domain in AA coordinates: TBD) 

MSLPPGFRFHPTDEELVAYYLDRKVNGQAIELEIIPEVDLYKCEPWDLPEKSFLPGNDME

WYFYSTRDKKYPNGSRTNRATRAGYWKATGKDRTVESKKMKMGMKKTLVYYRGRAPHGLR

TNWVMHEYRLTHAPSSSLKESYALCRVFKKNIQIPKRKGEEEEAEEESTSVGKEEEEEKE

KKWRKCDGNYIEDESLKRASAETSSSELTQGVLLDEANSSSIFALHFSSSLLDDHDHLFS

NYSHQLPYHPPLQLQDFPQLSMNEAEIMSIQQDFQCRDSMNGTLDEIFSSSATFPASL*
>G979 (60.. 1352) 

CCTCTGAGGAATCAAATCACTGACACTCCAAAAAAAAA 

TGAAGAAGCGCTTAACCACTTCCACTTGTTCTTCTTCTCCATCTTCCTCTGTTTCTTCTT 
CTACTACTACTTCCTCTCCTATTCAGTCGGAGGCTCCAAGGCCTAAACGAGCCAAAAGGG 
CTAAGAAATCTTCTCCTTCTGGTGATAAATCTCATAACCCGACAAGCCCTCCTTCTACCC 
GACGCAGCTCTATCTACAGAGGAGTCACTAGACATAGATGGACTGGGAGATTCGAGGCTC 
ATCTTTGGGACAAAAGCTCTTGGAATTCGATTCAGAACAAGAAAGGCAAACAAGTTTATC 
TGGGAGCATATGACAGTGAAGAAGCAGCAGCACATACGTACGATCTGGCTGCTCTCAAGT
ACTGGGGACCCGACACCATCTTGAATTTTCCGGCAGAGACGTACACAAAGGAATTGGAAG 
AAATGCAGAGAGTGACAAAGGAAGAATATTTGGCTTCTCTCCGCCGCCAGAGCAGTGGTT 
TCTCCAGAGGCGTCTCTAAATATCGCGGCGTCGCTAGGCATCACCACAACGGAAGATGGG 
AGGCTCGGATCGGAAGAGTGTTTGGGAAC^^GTACTTGTACCrCGGCACCTATAATACGC 
AGGAGGAAGCTGCTGCAGCATATGACATGGCTGCGATTGAGTATCGAGGCGCAAACGCGG 
TTACTAATTTCGACATTAGTAATTACATTGACCGGTTAAAGAAGAAAGGTGTTTTCCCGT 
TCCCTGTGAACCAAGCTAACC^TCAAGAGGGTATTCTTGTTGAAGCCAAACAAGAAGTTC 
AAACGAGAGAAGCGAAGGAAGAGCCTAGAGAAGAAGTGAAACAACAGTACGTGGAAGAAC 
CACCGCAAGAAGAAGAAGAGAAGGAAGAAGAGAAAGCAGAGCAAC^ 

TAGGATATTCAGAAGAAGCAGCAGTGGTCAATTGCTGCATAGACTCTTCAACCATAATGG 

AAATGGATCGTTGTGGGGACAACAATGAGCTGGCTTGGAACTTCTGTATGATGGATACAG 

GGTTTTCTCCGTTTTTGACTGATCAGAATCTCGCGAATGAGAATCCCATAGAGTATCCGG 

AGCTATTCAATGAGTTAGCATTTGAGGAGAA.CATCGACTTCATGTTCGATGATGGGAAGC 

ACGAGTGCTTGAACTTGGAAAATCTGGATTGTTGCGTGGTGGGAAGAGAGAGCCCACCCT 

CTTCTTCTTGACCATTGTCTTGCTO 

CAACCTCGGTTTCTTGTAACTATTTGGTCTGA 

TTTCTATTTCTTCCGCTTCTTCOT 

TATTTCAGTTTCAGGGCTTGTTCGTTGGTTCTGAATAATCAATGTCTTTGCCCCTTTTNN 
AANGNTNCAAGWTNAAANAAAAAAAAAAAA 

>G979 Amino Acid Sequence (domain in AA coordinates: 63-139,165-233) 
MKKRLTTSTCSSSPSSSVSSSTTTSSPIQSEAPR 

RRSS IYRGVTRHRWTGRFEAHIjWDKSSWNS IQNKKGKQVYLGAYDSEEAAAHTYDLAAIjK 

YWGPDTILNFPAETYTKELEEMQROTKEEYriASLRRQSSGF 

EARIGRVFGNKYIiYIjGTYNTQEEAAAAYDM^ 

FPVNQAlfflQEGILVEAKQEVETREAKEEPREEVKQQYVEEPPQEEEEKEEEKAEQQEAEI 

VGYSEEAAVWCCIDSSTIMEMDRdGDN^ 

ELFNELAFEDNIDFMFCDGKHECIJTLENLDCCVVGRE 

TTSVSCNYLV* 

>G987 (1..4011) 

ATGGGTTCTTACTCAGCTGGCTTCCCTGGATCCTTGGACTGGTTTGATTTTCCCGGTTTA 
GGAAACGGATCCTATCTAAATGATCAACCTTTGTTAGATATTGGATCTGTTCCTCCTCCT 
CTAGACCCATATCCTCAACAGAATCTTGCTTCTGCGGATGCTGATTTCTCTGATTCTGTT 
TTGAAGTACATAAGCCAAGTTCTTATGGAAGAGGACATGGAAGATAAGCCTTGTATGTTT 
CATGATGCTTTATCTCTTCAAGCAGCTGAGAAGTCTCTCTATGAAGCTCTCGGCGAGAAG 
TACCCGGTTGATGATTCTGATCAGCCTCTGACTACTACTACTAGCCTTGCTCAATTGGTT 
AGTAGTCCTGGTGGTTCTTCTTATGCTTCAAGCACCACAACCACTTCCTCTGATTCACAA 
TGGAGTTTTGATTGTTTGGAGAATAATAGGCCTTCTTCTTGGTTGCAGACACCGATCCCG 
AGTAACTTCATTTTTCAGTCTACATCTACTAGAGCCAGTAGCGGTAACGCGGTTTTCGGG 
TCAAGTTTTAGCGGTGATTTGGTTTCTAATATGTTTTU^TGATACTGACTTGGCGTTACAA 
TTCAAGAAAGGGATGGAGGAAGCTAGTAAATTCCTTCCTAAGAGCTCTCAGTTGGTTATA 
GATAACTCTGTTCCTTAACAGATTAACCGGAAAGAAGAGCCATTGGCGCGAAGAAGAACAT 
TTGACTGAAGAAAGAAGTAAGAAACAATCTGCTATTTATGTTGATGAAACTGATGAGCTT 
ACTGATATGTTTGACAATATTCTGATATTTGGCGAGGCTAAGGAACAACCTGTATGCATT 
CTTAACGAGAGTTTCCCTAAGGAACCTGCGAAAGCTTCAACGTTTAGTAAGAGTCCTAAA 
GGCGAAAAACCGGAAGCTAGTGGTAACAGTTATACAAAAGAGACACCTGATTTGAGGACA 
ATGCTGGTTTCTTGTGCTCAAGCTGTTTCGATTAACGATCGTAGAACTGCTGACGAGCTG 
TTAAGTCGGATAAGQCAACATTCTTCATCTTACGGCGATGGT^ACAGAGAGATTGGCTCAT 
TATTTTGCTAACAGTCTTGAAGCACGTTTGGCTGGGATAGGTACACAGGTTTATACTGCC 
TTGTCTTCCAAGAAAAC^TCrACTTCT 

GTCTGTCCGTTCAAGAAAATCGCAATCATATTCGCCAACCATAGTATTATGCGGTTGGCT 
TCAAGTGCTAATGCCAAAACC^TCCACATCATAGATTTTGGAATATCTGATGGTTTCCAG 
TGGCCTTCTCTGATTCATCGACTTGCTTGGAGACGTGGTTCATCTTGTAAGCTTCGGATA 
ACCGGTATAGAGTTGCCTCAACGTGGTTTTAGACCAGCCGAGGGAGTTATTGAGACTGGT 
CGTCX3CTTGGCTAAGTATTGTCAGAAGTTCAATATTCCGTTTGAGTACAATGCGATTGCG 
CAGAAATGGGAATCAATCAAGTTGGAGG ACTTGAAG CTAAAAGAAGG CGAGTTTGTTG CG 
GTAAACTCTTTATTTCGGTTTAGGAATCTTCTAGATGAGACGGTGGCAGTGCATAGCCCG 
AGAGATACGGTTTTGAAGCTGATAAGGAAGATAAAGCCAGACGTGTTCATCCCCGGGATC
CTCAGCGGATCCTACAACGCGCCTTTCTTTGTCACGAGGTTTAGAGAAGTTCTGTTTCAT 
TACTCATCTCTGTTTGACATGTGTGACACGAATCTAACACGGGAAGATCCAATGAGGGTT 
ATGTTTGAGAAAGAGTTCTATGGGCGGGAGATCATGAACGTGGTGGCGTGTGAGGGGACG 
GAGAGAGTGGAGAGGCCAGAGAGTTATAAGCAGTGGCAGGCGAGGGCGATGAGAGCCGGG 
TTTAGACAGATTCCGCTGGAGAAGGAACTAGTTCAGA^CTGAAGTTGATGGTGGAAAGT 
GGATACAAACCCAAAGAGTTTGATGTTGATG 

AAAGGTAGAATTGTATACGGTTCATCTATTTGGGTTCCTTTCTTTTTCTATGTGGGCAGA 
GCAACTAGGGTTTTGATCATGGATCCAAAC 

TTTGATGGTAACCCTAATTTGCTTACTGATCCAATGGAAGATCAGTATCCACCACCATCT 
GATACTCTGTTGAAATACGTGAGTGAGATT6TTATGGAAGAGAGTAATGGAGATTATAAG 
CAATCTATGTTCTATGATTCATTGGCTTTACGAAAAACTGAAGAAATGTTGCAGCAAGTC 
ATTACTGATTCTCAAAATCAGTCCTTTA^ 

GATGCAAGCGGAAGC^TCGATGAATCGGCnrrATTCGGCrGATCCGCAACCT 

ATTATGGTTAAGAGTATGTTTAGTGATGCAGAATCAGCTTTACAGTTTAAGAAAGGGGTT 

GAAGAAGCTAGTAAATTCCTTCCCAATAGTGATCAATGGGTTATCAATCTGGATATCGAG 

AGATCCGAAAGGCGCGATTCGGTTAAAGAAGAGATGGGATTGGATCAGTTGAGAGTTAAG 

AAGAATCATGAAAGGGATTTTGAGGAA.GTTAGGAGTAGTAAGCAATTTGCTAGTAATGTA 

GAAGATAGTAAGGTTACAGATATGTTTGATAAGGTTTTGCTTCTTGACGGTGAATGCGAT 

CCGCAAAGATTGTTAGACAGCGAGATTCAAGCGATTCGGAGTAGTAAGAACATAGGAGAG 

AAAGGGAAGAAGAAGAAGAAGAAGAAGAGTCAAGTGGTTGATTTTCGTACACTTCTCACT 

CATTGTGCACAAGCCATTTCCACAGGAGATAAAACCACGGCTCTTGAGTTTCTGTTACAG 

ATAAGGCAACAGTCTTCGCCTCTCGGTGACGCGGGGCAAAGACTAGCTCATTGTTTCGCT 

AACGCGCTTGAAGCTCGTCTACAGGGAAGTACCGGTCCTATGATCCAGACTTATTACAAT 

GCTTTAACCTCGTCGTTGAAGGATACTGCTGCGGATACAATTAGAGCGTATCGAGTTTAT 

CTTTCTTCGTCTCCGTTTGTTACCTTGATGTATTTCTTCTCCATCTGGATGATTCTTGAT 

GTGGCTA^GATGCTCCTGTTCTTCATATAGTTGATTTTGGGATTCTATACGGGTTTCAA 

TGGCCGATGTTTATTCAGTCTATATCAGATCGAAAAGATGTACCGCGGAAGCTGCGGATT 

ACTGGTATCGAGCTTCCTCAGTGCGGGTTTCGGCCCGCGGAGCGAATAGAGGAGACAGGA 

CGGAGATTGGCTGAGTATTGTAAACGGTTTAATGTTCCGTTTGAGTACAAAGCCATTGCG 

TCTCAGAACTGGGAAACAATCCGGATAGAAGATCTCGATATACGACCAAACGAAGTCTTA 

GCGGTTAATGCTGGACTTAGACTCAAGAACCTTCAAGATGAAACAGGAAGCGAAGAGAAT 

TGCCCGAGAGATGCTGTCTTGAAGCTAATAAGAAACATGAACCCGGACGTTTTCATCCAC 

GCGATTGTCAACGGTTCATTCAACGCACCCTTCTTTATCTCGCGGTTTAAAGAAGCGGTT

TACCATTACTCCGCTCTCTTCGACATGTTTGATTCGACGTTGCCTCGGGATAACAAAGAG 

AGGATTAGGTTCGAGAGGGAGTTTTACGGGAGAGAGGCTATGAACGTGATAGCGTGCGAG 

GAAGCTGATCGAGTGGAGAGGCCTGAGACTTACAGGCAATGGCAGGTTAGAATGGTTAGA 

GCCGGGTTTAAGCAGAAAACGATTAAGCCTGAGCTGGTAGAGTTGTTTAGAGGAAAGCTG 

AAGAAATGGCGTTACCATAAAGACTTTGTGGTTGATGAAAATAGTAAATGGTTGTTACAA 

GGCTGGAAAGGTCGAACTCTCTATGCTTCTTCTTGTTGGGTTCCTGCCTAG 

>G987 Amino Acid Sequence (domain in AA coordinates: 428-432,704-708) 

MGSYSAGFPGSLDWFPFPGIiGNGSYIiNDQPIiLDIGSVPPPLDPYPQQNLASADADFSDSV 

LKYISQVLMEEDMEDKPC^HDALSLQAAEKSL^ 

SSPGGSSYASSTTTTSSDSQWSFDCIjENNRPSSWLQTPIPSNFIFQSTSTRASSGNAVFG 

SSFSGDLVSNMFNDTDLALQFKKGMEEASKFLPKSSQLVIDNSVPNRLTGKKSHWREEEH
LTEERSKKQSAIYVDETDELTDMFDNILIFGEAKEQPVCILNESFPKEPAKASTFSKSPK

GEKPEASGNSYTKETPDLRTMLVSCAQAVS INDRRTADELLSR IRQHSS S YGDGTERIjAH 
YFANSLEARLAGIGTQVYTALSSKECTSTSDMIjKAYQTYISVCPFKKIAIIFANHSIMRIA 
SSAIIAKTIHIIDFGISDGFQWPSLIHRIjAWRI^ 

RRLAKYCQKFNIPFEYNAIAQKWES IKLEDLKLKEGEFVAVNSLFRFRNLIiDETVAVHSP 
RDTVLKLIRKIKPDVFIPGILSGSYHAPFFVTRFREV^ 

mfekefygreimnwacegterverpesykqwqaramragfrqiplekelvqklklk^ 

GYKPKEFDVDQDCHWLIjQGWKGRIVYGSSIWVPFFFYVGRATRVLIMDPNFSESIjNGFEY 
FDGNPNLLTDPMEDQYPPPSDTLLKYVSEILMEESNGDYKQSMFYDSLALRKTEEMLQQV

ITDSQNQSFSPADSLITNSWDASGSIDESAYSADPQPVNEIMVKSMFSDAESALQFKKGV

EKASKFIiPNSDQWVINLDIERSERRDSVKEEMGLDQLRVKKNHERDFEEVRSSKQFASOT 

edskvtdmfdkvllldgecdpqtlldseiqairss 
HCAQAISTGDKTTALEFLLQIRQQSSPLGDAGQRLAHCFANALEARLQGSTGPMIQTYYN
AIiTSSLKI)TAADTIRAYRVYL^ 

WPMFIQSISDRKDVPRKLRITGIELPQCGFRPAERIEETGRRLAEYCKRFNVPFEYKAIA

SQNWETIRIEDLDIRPNEVLAVNAGLRLKNLQDETGSEENCPRDAVLKLIRNMNPDVFIH

AIVNGSFNAPFFISRFKEAVYHYSALFDMFDSTLPRDNKERIRFEREFYGREAMNVIACE

EADRVERPETYRQWQVRMVRAGFKQKTIKPELVELFRGKLKKWRYHKDFVVDENSKWLLQ

GWKGRTLYASSCWVPA* 

>G993 (6..1091)

CAAATATGGAATACAGCTGTGTAGACGACAGTAGTACAACGTCAGAATCTCTCTCCATCT 

CTACTACTCCAAAGCCGACAACGACGACGGAGAAGAAACTCTCTTCTCCGCCGGCGACGT 

CGATGCGTCTCTACAGAATGGGAAGCGGCGGAAGCAGCGTCGTTTTGGATTCAGAGAACG 

GCGTCGAGACCGAGTCACGTAAGCTTCCTTCGTCGAAATATAAAGGCGTTGTGCCTCAGC 

CTAACGGAAGATGGGGAGCTCAGATTTACGAGAAGCATCAGCGAGTTTGGCTCGGTACTT 

TCAACGAGGAAGAAGAAGCTGCGTCTTCTTACGACATCGCCGTGAGGAGATTCCGCGGCC 

GCGACGCCGTCACTAACTTCAAATCTCAAGTTGATGGAAACGACGCCGAATCGGCTTTTC 

TTGACGCTCATTCTAAAGCTGAGATCGTGGATATGTTGAGGAAACACACTTACGCCGATG 

AGTTTGAGCAGAGTAGACGGAAGTTTGTTAACGGCGACGGAAAACGCTCTGGGTTGGAGA 

CGGCGACGTACGGAAACGACGCTGTTTTGAGAGCGCGTGAGGTTTTGTTCGAGAAGACTG 

TTACGCCGAGCGACGTCGGGAAGCTGAACCGTTTAGTGATACCGAAACAACACGCGGAGA 

AGCATTTTCCGTTACCGGCGATGACGACGGCGATGGGGATGAATCCGTCTCCGACGAAAG 

GCGTTTTGATTAACTTGGAAGATAGAACAGGGAAAGTGTGGCGGTTCCGTTACAGTTACT 

GGAACAGCAGTCAAAGTTACGTGTTGACCAAGGGCTGGAGCCGGTTCGTTAAAGAGAAGA 

ATCTTCGAGCCGGTGATGTGGTTTGTTTCGAGAGATCAACCGGACCAGACCGGCAATTGT 

ATATCCACTGGAAAGTCCGGTCTAGTCCGGTTCAGACTGTGGTTAGGCTATTCGGAGTCA 

ACATTTTCAATGTGAGTAACGAGAAACCAAACGACGTCGCAGTAGAGTGTGTTGGCAAGA 

AGAGATCTCGGGAAGATGATTTGTTTTCGTTAGGGTGTTCCAAGAAGCAGGCGATTATCA

ACATCTTGTGACAAATTCTTTTTTO 

ATATTTTGTATTGAAATGACAAGTTGTAAATTAGGACAAGACAAGAAAAAATGACAACTA 
GACAAAATAGTTTTTGTTTAAAAAAAAAAAAAAAAAAAA 

>G993 Amino Acid Sequence (domain in AA coordinates: 69-134) 
MEYSCVDDSSTTSESLSISTTPKPTTTTEKKLSSPPATSMRLYRMGSGGSSVVLDSENGV
ETESRKLPSSKYKGVVPQPNGRWGAQIYEKHQRVWLGTFNEEEEAASSYDIAVRRFRGRD
AVTNFKSQVDGNDAESAFLDAHSKAEIVDMLRKHTYADEFEQSRRKFVNGDGKRSGLETA
TYGNDAVLRAREVLFEKTVTPSDVGKLNRLVIPKQHAEKHFPLPAMTTAMGMNPSPTKGV
LINLEDRTGKVWRFRYSYWNSSQSYVLTKGWSRFVKEKNLRAGDVVCFERSTGPDRQLYI
HWKVRSSPVQTVVRLFGVNIFNVSNEKPNDVAVECVGKKRSREDDLFSLGCSKKQAIINI
L*

>G681 (1..804) 

ATGGGGAGGACGACATGGTTCGACGTCGACGGGATGAAGAAAGGAGAGTGGACGGCAGAG 

GAAGACCAGAAGCTCGGCGCTTACATCAACGAGCATGGCGTTTGTGATTGGCGTTCCCTC 

CCCAAAAGAGCTGGTTTGCAGAGATGTGGAAAGAGCTGCAGATTAAGGTGGCTTAACTAT 

CTAAAGCCTGGGATTAGAAGAGGCAAATTCACTCCTCAAGAAGAAGAAGAAATCATCCAA 

CTTCATGCTGTTCTCGGAAACAGGTGGGCAGCCATGGCGAAGAAGATGCAGAATCGAACA 

GACAATGATATCAAGAACCATTGGAACTCTTGTCTCAAGAAAAGACTTTCGAGAAAGGGA

ATCGACCCTATGACCCACGAGCCCATCATCAAACACCTCACCGTCAATACCACTAACGCA 

GATTGTGGTAACTCTTCCACCACGACGTCCCCGTCGACGACGGAAAGCTCTCCTTCCTCC 

GGCTCGTCTCGTCTTCTTAACAAACTCGCCGCAGGTATCTCATCTAGACAACATAGTCTC 

GATAGGATCAAGTAeATCTTGTCGAATTCAATAATCGAAAGCAGTGATCAAGCAAAAGAG 

GAAGAAGAAAAAGAAGAAGAAGAAGAAGAAAGAGATTCAATGATGGGTCAGAAGATTGAC 

GGTAGTGAAGGAGAAGATATTCAGATTTGGGGCGAGGAGGAAGTTAGGCGTTTAATGGAG 

ATTGATGCAATGGATATGTACGAGATGACTTCGTACGACGCTGTCATGTACGAGAGTAGT 

CACATACTTGATCATCTCTTTTGACTTAATATAGTGTGACTGTGTGAGTGCATGCATGTT 

>G681 Amino Acid Sequence (domain in AA coordinates: 14-120)

MGRTTWFDVDGMKKGEWTAEEDQKLGAYINEHGVCDWRSLPKRAGLQRCGKSCRLRWLNY

LKPGIRRGKFTPQEEEEIIQLHAVLGNRWAAMAKKMQNRTDNDIKNHWNSCLKKRLSRKG

IDPMTHEPIIKHLTVNTTNADCGNSSTTTSPSTTESSPSSGSSRLLNKLAAGISSRQHSL

DRIKYILSNSIIESSDQAKEEEEKEEEEEERDSMMGQKIDGSEGEDIQIWGEEEVRRLME
IDAMDMYEMTSYDAVMYESSHILDHLF*
>G1482 (1..996) 

ATGAAGATCAGGTGCGACGTCTGCGATAAAGAAGAAGCGTCGGTGTTTTGCACGGCCGAC

GAAGCATCTCTCTGCGGCGGCTGCGACCACCAAGTCCACCACGCTAACAAACTCGCCTCT

AAACATCTCCGTTTCTCTCTCCTTTATCCTTCTTCTTCCAACACCTCCTCTCCTCTCTGC 

GACATCTGTCAGGATAAAAAAGCTCTGTTGTTCTGTCAACAAGATAGAGCTATTTTATGC 

AAAGATTGCGATTCATCGATCCACGCTGCGAACGAACACACAAAGAAACACGATAGGTTT 

CTTCTTACAGGGGTTAAGCTCTCTGCAAC^TCGTCTGTTTACAAACCTACTTCGAAATCT 

TCTTCTTCTTCTTCAAGCAACCAAGATTTCTCTGTCCCTGGATCATC^U^TCTCTAATCCT 

CCTCCTCTCAAGAAACCTCTCrrCAGCTCCTCCTC^GAGC^CAAGATCC^CCCTTTTCG 

AAGATCAACGGCGGTGATGCGTCGGTGAATCAGTGGGGATCCACAAGCACGATTTCTGAG 

TATTTGATGGATACGTTACCTGGTTGGCACGTTGAGGATTTCCTCGATTCCTCTCTTCCT 

ACTTATGGTTTCTCTAAGAGTGGTGATGATGATGGAGTGTTACCATATATGGAACCAGAA 

GATGACAACAACACTAAGAGAAACAACAACAACAACAACAACAACAAC^^ 

TCACTTCCATCTAAGAATTTAGGGATTTGGGTCCCTCAGATTCCACAAACTCTTCCTTCT 

TCATACCCAAATCAATACTTTTCTCAAGACAAC^UVCATACAGTTTGGGATGTACAACAAA 

GAAACATCACCAGAAGTAGTGTCTTTTGCTCCAATACAAAACATGi^ 

AACAACAAGAGATGGTATGATGATGGTGGCTTCACTGTCCCACAGATCACTCCTCCTCCT 
CTTTCTTCTAATAAAAAGTTTAGATCTTTCTGGTAA 

>G1482 Amino Acid Sequence (domain in AA coordinates: 5-63)
MKIRCDVCDKEEASVFCTADEASLCGGCDHQVHHANKLASKHLRFSLLYPSSSNTSSPLC
DICQDKKALLFCQQDRAILCKDCDSSIHAANEHTKKHDRFLLTGVKLSATSSVYKPTSKS

SSSSSSNQDFSVPGSSISNPPPDKKPLSAPPQSNKIQPFSKINGGDASVNQWGSTSTISB 
YLMDTLPGWHVEDFLIDSSIiPTYGFSKSGDDDGVLPYM^ 

SLPSKNLGIWVPQIPQTIiPSSYPNQYFSQDNNIQFGMYNKETSPEWSFAPIQ^ 

NNKRWYDDGGFTVPQITPPPLSSNKKFRSFW* 

>G225 (157..441)

CTCTCTCTCTCACTCTTTTCTTTT 

TTCCCCTCGTGAGGAAATC^TTTCT 

CTTTCTCTGTGTGTTTCGTGTCTTCAGATTAGTTCGATGTTTCGTTCAGACAAGGCGGAA 

AAAATGGATAAACGACGACGGAGACAGAGCAAAGCCAAGGCTTCTTGTTCCGAAGAGGTG 

AGTAGTATCGAATGGGAAGCTGTGAAGATGTCAGAAGAAGAAGAAGATCTCATTTCTCGG 

ATGTATAAACTCGTTGGCGACAGGTGGGAGTTGATCGCCGGAAGGATCCCGGGACGGACG 

CCGGAGGAGATAGAGAGATATTGGCTTATGAAACACGGCGTCGTTTTTGCCAACAGACGA 

AGAGACTTTTTTAGGAAATGATTTTTTTTGTTTGGATTAAAAGAAAATTTTCCTCTCCTT 

AATTCACAAGACAAGAAAAAAAGGAAATGTACCTGTCCTTGAATTACTATTTTGGAATGT 

ATAATTATCTATATATATAAGAAGAAAAAATTGCTTAGGAATTT

>G225 Amino Acid Sequence (domain in AA coordinates: 39-76) 

MFRSDKAEKMDKRRRRQSKAKASCSEEVSSIEWEAVKMSEEEEDLISRMYKLVGDRWELI

AGRIPGRTPEEIERYWLMKHGVVFANRRRDFFRK*

>G226 (10..348)

CCAGTAGTTATGGATAATACCAACCGTCTTCGTCTTCGTCGCGGTCCCAGTCTTAGGCAA

ACTAAGTTCACTCGATCCCGATATGACTCTGAAGAAGTGAGTAGCATCGAATGGGAGTTT 

ATCAGTATGACCGAACAAGAAGAAGATCTCATCTCTCGAATGTACAGACTTGTCGGTAAT 

AGGTGGGATTTAATAGCAGGAAGAGTCGTAGGAAGAAAGGGZUVATGAGATTGAGAGATAC 

TGGATTATGAGAAACTCTGACTATTTTTCTCACAAACGACGACGTCTTAATAATTCTCCC 

TTTTTTTCTACTTCTCCTCTTAATCTCCAAGAAAATCTAAAATTGTAAAGAAATCAAAAT 

AAAAGCTTTCAATCATAAAAGTAGAACAAATCTTGAATGTCTTCTCA 

>G226 Amino Acid Sequence (domain in AA coordinates: 28-78) 

MDNTNRLRLRRGPSLRQTKFTRSRYDSEEVSSIEWEFISMTEQEEDLISRMYRLVGNRWD

LIAGRWGRKT^EIERYWIMRNSDYFSHKRRRIiNN^ 

>G9 (81..1139)

GTGTTTCTTCTTTCTGCTAAAAGGTTATAATTTTTGTTTCTTGGTTTGGTGAGAATCTTC 
AAGAAACTGAAACAAAGAAAATGGATTCTAGTTGCATAGACGAGATAAGTTCCTCCACTT
CAGAATCTTTCTCCGCCACCACCGCCAAGAAGCTCTCTCCTCCTCCCGCGGCGGCGTTAC 
GCCTCTACCGGATGGGAAGCGGCGGGAGCAGCGTCGTGTTGGATCCCGAGAACGGCCTAG 
AGACGGAGTCACGAAAGCTACCATCTTCAAAATACAAAGGTGTTGTTCCTCAGCCTAACG 
GAAGATGGGGAGCTCAGATCTACGAGAAGCACCAACGAGTATGGCTCGGGACTTTCAACG

AGCAAGAAGAAGCTGCTCGTTCCTACGACATCGCAGCTTGTAGATTCCGTGGCCGCGACG 

CCGTCGTCAACTTCAAGAACGTTCTGGAAGACGGCGATTTAGCTTTTCTTGAAGCTCACT

CAAAGGCCGAGATCGTCGACATGTTGAGAAAACACACTTACGCCGACGAGCTTGAACAGA 

ACAATAAACGGCAGTTGTTTCTCTCCGTCGACGCTAACGGAAAACGTAACGGATCGAGTA 

CTACTCAAAACGACAAAGTTTTAAAGACGTGTGAAGTTCTTTTCGAGAAGGCTC

CTAGCGACGTTGGGAAGCTAAACCGTCTCGTGATACCTAAACAACACGCCGAGAAACACT 

TTCCGTTACCGTCACCGTCACCGGC^GTGACTAAAGGAGTTTTGATCAACTTCGAAGACG 

TTAACGGTAAAGTGTGGAGGTTCCGTTACTCATACTGGAACAGTAGTCAAAGTTACGTGT 

TGACCAAGGGATGGAGTCGATTCGTCAAGGAGAAGAATCTTCGAGCCGGTGATGTTGTTA 

CTTTCGAGAGATCGACCGGACTAGAGCGGCAGTTATATATTGATTGGAAAGTTCGGTCTG 

GTCCGAGAGAAAACCCGGTTCAGGTGGTGGTTCGGCTTTTCGGAGTTGATATCTTTAATG 

TGACCACCGTGAAGCCAAACGACGTCGTGGCCGTTTGCGGTGGAAAGAGATCTCGAGATG 

TTGATGATATGTTTGCGTTACGGTGTTCCAAGAAGCAGGCGATAATCAATGCTTTGTGAC 

ATATTTCCTTTTCCGATTTTATGCTTTCGTTTTTTAATTTTTTT 

AGGTTGTGATTCATGCTAGGTTGTATTTAGGAAAAGAGATAAGACC 

>G9 Amino Acid Sequence (domain in AA coordinates: 62-127) 

MDSSCIDEISSSTSESFSATTAKKLSPPPAAALRLYRMGSGGSSVVLDPENGLETESRKL

PSSKYKGVVPQPNGRWGAQIYEKHQRVWLGTFNEQEEAARSYDIAACRFRGRDAVVNFKN

VLEDGDLAFLEAHSKAEIVDMLRKHTYADELEQNNKRQLFLSVDANGKRNGSSTTQNDKV

LKTCEVLFEKAVTPSDVGKLNRLVIPKQHAEKHFPLPSPSPAVTKGVLINFEDVNGKVWR

FRYSYWNSSQSYVLTKGWSRFVKEKNLRAGDVVTFERSTGLERQLYIDWKVRSGPRENPV
QVVVRLFGVDIFNVTTVKPNDVVAVCGGKRSRDVDDMFALRCSKKQAIINAL*
>G1040 (51..863)

CTTTGATCTCCACTATTTAAGTAGACAAGAATCATAAAGAAAATAGTGAGATGATGATGT 

TAGAGTCAAGAAACAGTATGAGAGCTTCAAACTCAGTCCCAGATCTGTCTCTTCAGATCA 

GTCTTCCTAACTATCACGCCGGAAAACCTCTTCACGGCGGTGACCGGAGCTCCACAAGCA 

GTGATTCTGGAAGGAGCCTCAGTGACCTGAGCCATC^ 

TCTTGAGCTTAGGATTTGACCATCATGATCAAAGGCGCT 

TCTACGGTCGAGATTTCAAGAGAAGCTCATCATCAATGGTTGGTCTTAAACGAAGCATTC 
GTGCTCCAAGAATGAGATGGACTTCTACTCTTC^ 

TTCTTGGCGGCCATGAAAGAGCAACGCCTAAATCAGTGTTGGAGCTCATGAATGTGAAGG 
ATCTAACCCTAGCTC^TGTCAAGAGTCACTTGCAGATGTATAGAACAGTGAAATGCACTG 
ATAAAGGATCACCAGGAGAAGGAAAGGTAGAGAAAGAGGCAGAGCAGAGGATAGAGGACA 
ATAATAATAATGAAGAAGCTGATGAAGGAACTGACACAAATTCGCCAAACTCATCATCTG 
TGCAAAAGACCCAAAGAGCTTCATGGTCATCGACAAAGGAAGTATCTAGGAGCATATCTA 
CACAAGCATATTCTCACTTGGGAACAACTCATC^^ 

ATACCAACATTCATCTCAATTTGGATTTCACATTGGGCGGCCTAGTTGGGGGATGGAATA 
TGCGGAACCCTCCAGTGATTTAACCCTTCTCAAGTGCTAATTGCCTTAAGCTACAACAAA 
TAAGTGAGCTTAGGTTACCAGTTTTAACATAATTTTAACTTGTTTTGATCATATGAGCTT 
CGGAAGAATCATATTATCATCATATATGAACTTCTTTCCAAGAATGTTCTATGAGTTTTT 
TGATATGTATAATCAAGAGAATCGTTTGAAGTAAAAA 

>G1040 Amino Acid Sequence (domain in AA coordinates: 109-158) 
MMMLESRNSMRASNSVPDLSLQISLPNYHAGKPLHGGDRSSTSSDSGSSIjSDLSHENNFF 
NKPLLSLGFDHHHQRRSNMFQPQIYGRDFKRSSSSMVGLKRSIRAPRMRWTSTLHAHFVH 
AVQLLGGHERATPKSVLEIiMET^^ 

IEDNNNNEEADEGTDTNSPNSSSVQKTQRASWSSTKEVSRSISTQ 
EKEDTNIHLNLDFTLGGLVGGWNMRNPPVI * 
>G2114 (64..1311)

ATAAAACGAAACCCTATACATATAAACTAAGAGCGAGAAAGACAGCTAGAGAGAGAGAGA 
GAGATGAAGAAATGGTTGGGATTTTCATTGACACCTCCTTTGAGAATCTGCAATAGTGAA 
GAAGAAGAACITAGGGATGACGGTTCCGATGTTTGGAGATATGATATTAACTTTGATCAT 
CATCATCATGATGAAGACGTTCCAAAGGTGGAAGATCTCCTCTCAAACTCTCATCAAACC 
GAGTATCCTATAAACCATAACCAAACCAATGTCAACTGCACCACTGTGGTTAAC^GGTTA 
AACCC^CCCGGTTACCTTCTCCACGACCAAACCGTAGTTACACCACATTACCCGAACCTA 
GATCCGAACCTTAGCAATGATTATGGAGGTTTTGAGAGGGTCGGTTCGGTCTCGGTTTTC 
AAATCTTGGTTAGAGCAAGGCACTCCAGCATTCCCACTCTCGAGTCATTACGTTACTGAA 
GAGGCTGGTACGAGCAATAATATTAGTCATTTTAGTAACGAAGAGACTGGTTATAACACC

TCGAATGTATCCGCACGGGTCGAAGAACCGGTTAAGGTAGATGAGAAGCGGAAGAGATTG 
GTTGTTAAACCTCAGGTAAAGGAATCCGTTCCTCGGAAGTCGGTTGATAGTTATGGACAA 
AGAACTTCTCAGTATCGTGGAGTTACAAGGCATAGATGGACAGGGAGATATGAAGCTCAC 
TTATGGGATAATAGCTGTAAGAAGGAGGGACAGACAAGGAGAGGAAGACAAGTGTATCTT 
GGAGGGTATGATGAGGAGGAGAAAGCAGCGAGGGCATATGATTTAGCGGCTCTGAAGTAT 
TGGGGTCCTACCACTCACTTAAATTTCCCTTTGAGTAATTACGAAAAGGAGATCGAGGAA 
CTCAATAACATGAATCGGCAAGAATTTGTTGCCATGTTGAGGAGGAATAGCAGCGGGTTT 
TCGAGGGGAGCTTCCGTGTATAGAGGAGTTACAAGGCATCATCZAACATGGAAGGTGGCAA 
GCCAGAATTGGAAGAGTTGCTGGAAAGAAGG ACTTGT^ CACG CAA 

GAAGAAGCAGCGGAGGCGTACGATATCGCGGCAATTAAATTCAGAGGCCTAAACGCTGTA 
ACCAATTTCGATATAAATAGATATGACGTGAAGAGGATATGTTCAAGCrCAACGATTGTT 
GATAGCGACCAGGCCAAACATTCTCCCACCAGCTCTGGCGCCGGCCACTAACCGACACCG 
TAAACTCCTCGCCGGAGAGACTATTCCCACGTACGGTTGGTTTGAGGAAATAAGTTCGTC 
CAGTCTGTTTAATCATTTATGGTrTAATAAACATATATTCCTAAGTAATTGAGGCCGGTC 
TACATATATACAACTTTTTTAGCAAATTAAGTTATCAGAATCCACTATATATTAT^ 

>G2114 Amino Acid Sequence (conserved domain in AA coordinates: 221-297, 323-393)

MKKWLGFSLTPPLRICNSEEEELRHDGSDVWRYDI^ 

YPINHNQTNWCTTVVI^ 

SWLEQGTPAFPLSSHYVTEEAGTSNNISHFSNEETGY3STTNGSMIiSIiALSHGAC 

NVSARv^EPVKVDEKRKRLVVKPQVKESVPRKSV^ 

WDNSCKKEGQTRRGRQWLGGYDEEEKAARAYDIiATUjKYWGPTra 

NNMMRQEFVAMLRRNS SGFSRGASVYRGVTRHHQHGRWQAR I GRVAGNKDLYLGTFSTQE 
EAAEAYDIAAI KFRGLNAVTNFD INRYDVKRI CS S S T I VDSDQAKHS PTS SGAGH* 
>G450 (65..751)

GAGTTATCGAGAGAGAGAGAAAACATATTCTGATTTAAGACATATATAGACAGCAAGAAG 
AGATATGAACCTTAAGGAGACGGAGCTTTGTCTTGGCCTCCCCGGAGGCACTGAAACCGT 
TGAAAGTCCGGCCAAGTCGGGTGTTGGGAACAAGAGAGGCTTCTCCGAGACCGTTGATCT 
CAAACTTAATCTTCAATCTAACAAACAAGGACATGTC 

CAAGGAGAAGACCTTCCTTAAAGACCCTTCTAAGCCTCCTGCTAAAGCACAAGTGGTGGG 
TTGGCCACCGGTGAGGAACTACCGGAAAAATGTTATGGCTAATCAGAAGAGCGGCGAAGC 
AGAGGAGGCAATGAGTAGTGGTGGAGGAACCGTCGCCTTTGTGAAGGTTTCCATGGATGG 
AGCTCCTTATCTTCGGAAGGTTGACCTCAAGATGTACACCAGCTACAAGGATCTCTCTGA 
TGCCTTGGCCAAAATGTTCAGCTCCTTTACCATGGGGAGTTATGGAGCACAAGGGATGAT 
AGATTTCATGAACGAGAGTAAAGTGATGGATCTGTTGAACAGTTCTGAGTATGTTCCAAG 
CTACGAGGACAAAGATGGTGACTGGATGCTCGTTGGTGATGTCCCCTGGCCGATGTTTGT 
CGAGTCATGCAAACGTTTGCGCATAATGAAAGGATCCGAAGCAATTGGACTTGCTCCAAG 
AGCAATGGAGAAGTTCAAGAACAGATCATGAACAAAAAAAAAAGAGGACAATATGCATTG 
ATTTTTTTTTTTTTGGTATTGTTATGATCATGTGTTTTAATTTAAAATATAGGAAGGATA 

AATTAGTCTGTGTTTTTGTTTTCATCTCTTAATTAGTAGAAATCATTTTTTAATATC 
TTGTGATAGTAAATCTATAGAGTTCGTA 

>G450 Amino Acid Sequence (domain in AA coordinates: TBD)
MNLKETELCLGLPGGTETVESPAKSGVGNKRGFSETVDLKLNLQSNKQGHVDLNTNGAPK
EKTFLKDPSKPPAKAQVVGWPPVRNYRKNVMANQKSGEAEEAMSSGGGTVAFVKVSMDGA
PYLRKVDLKMYTSYKDLSDALAKMFSSFTMGSYGAQGMIDFMNESKVMDLLNSSEYVPSY

EDKDGDWMLVGDVPWPMFVESCKRLRIMKGSEAIGLAPRAMEKFKNRS* 
>G584 (40..1809)

AAAAAGTCTTCTCTTTTATAACTACGTGAGAGAACTGTTATGTCTCCGACGAATGTTCAA 
GTAACCGATTACCATCTCAACCAATCAAAAACGGATACAACAAATCTCTGGTCAACCGAC 
GACGATGCATCGGTAATGGAAGCTTTCATCGGCGGCGGCTCCGATCATTCTTCTCTTTTT 
CCTCCACTTCCTCCTCCTCCTCTTCCTCAAGTCAACGAAGATAATCTCCAGCAACGTCTC 
CAAGCTTTAATCGAAGGAGCAAACGAGAACTGGACTTACGCCGTGTTCTGGCAATCATCT 
CACGGTTTCGCCGGAGAAGACAACAAGAACAACAACACAGTGTTGTTAGGTTGGGGAGAT 
GGTTATTACAAAGGAGAAGAAGAGAAGTCTAGAAAGAAGAAATCAAATCCAGCTAGTGCA 
GCTGAACAAGAGCATCGTAAGAGAGTGATTAGAGAGCTCAACTCTTTAATCTCCGGTGGT 
GTAGGAGGAGGAGATGAAGCTGGAGATGAAGAAGTTACAGATACTGAATGGTTCTTCTTA
GTTTCAATGACACAGAGCTTTGTCAAGGGTACTGGT 

TCAGACACGATTTGGTTATCTGGTTCTAATGCTTTAGCTGGATCAAGTTGTGAGAGAGCT 
CGTCAAGGTCAGATTTATGGGTTACAAACAATGGTGTGTGTAGCGACAGAGAATGGTGTC 
GTTGAGCTTGGTTCGTCGGAGATTATTCATCAAAGTTCAGATCTTGTTGATAAAGTTGAC 
ACCTTTTTCAATTTTAACAATGGTGGTGGTGAATTTGGTTCTTGGGCGTTTAATTTGAAT 
CCAGATCAAGGAGAGAATGATCCAGGTTTGTGGATTAGTGAACCTAATGGTGTTGACTCT 
GGTCTTGTAGCTGCTCCGGTGATGAATAATGGTGGAAATGACTCAACTTCTAATTCTGAT 
TCTCAACCAATTTCTAAGCTTTGTAATGGAAGCTCTGTTGAAAACCCTAACCCTAAAGTT 
CTGAAATCTTGTGAAATGGTGAATTTCAAGAATGGGATTGAGAATGGTCAAGAAGAAGAT 
AGTAGTAATAAGAAGAGATCACCGGTTTCGAATAATGAAGAAGGGATGCTTTCTTTTACC 
TCTGTTCTTCCATGTGACTCGAATCACTCTGATCTTGAAGCTTCAGTGGCTAAAGAAGCT 
GAGAGTAACAGAGTTGTGGTTGAACCGGAGAAGAAACCGAGGAAACGAGGGAGAAAACCG 
GCGAATGGAAGAGAAGAGCCTTTGAATCATGTAGAGGCAGAGAGACAGAGAAGAGAGAAG 
TTGAATCAGAGATTCTATTCTTTAAGAGCTGTGGTTCCTAATGTGTCTAAGATGGATAAA 
GCTTCTCTATTAGGAGATGCTATTTCGTATATCAGTGAGCTTAAGTCTAAGTTGCAAAAG 
GCTGAATCTGATAAAGAAGAGTTGCAGAAGCAGATTGATGTGATGAATAAAGAAGCGGGA 
AATGCGAAAAGTTCGGTAAAAGATCGT^AAATGTTTGAATCAAGAATCGAGTGTGTTGATA 
GAGATGGAGGTTGATGTGAAGATTATTGGTTGGGATGCAATGATAAGGATTCAATGTAGT 
AAGAGGAATCATCCTGGTGCTAAGTTCATGGAAGCACTTAAGGAGTTGGATTTGGAAGTG 
AATC^TGCGAGTTTATCGGTAGTGAATGATCTTATGATCCAACAAGCGACTGTGAAAATG 
GGGAATCAGTTTTTCACG(^y\GATCAACTCAAGGTTGCTCTAACGGAGAAAGTTGGAGAA 
TGTCCATGAATTGAAGTCAGCATCTTTAGGGCTAATACACCGGAGAATACTGCGAAAAGT 
CGAAAACAACGATCATAGTATAAGCCGCGGTAAAAAGTGTTAAACCTTTCACACAAGTTT 
CTCTAGTGAATGTAGTTGTAAACTCTATTGTGTAAGGGTAATTTTGTAGTACCCACTTGT 
TGCTATTGAATGCTTGTTAGAGAGGATTCTTAGTGTAGTATATGATTAGGTTGGGGTTTG 
TTGTTTCATGAGATAAATAAATGTGTTTGATCAA 

ATGTATGTAAATAAGGCTTTTGTTAGAAATAAGACAAATGGGACTGAAGTTGGAGTTTAA 
AA 

>G584 Amino Acid Sequence (domain in AA coordinates: 401-494)

MSPTNVQVTDYHLNQSKTDTTNIiWST^ 

DNLQQRLQALIEGANEIWTYAVFWQSSHGFA 

KSNPASAAEQEHRKRVIRELNS L I SGGVGGGDEAGDEEVTDTEWFFLVSMTQS FVKGTGL 
PGQAFSNSDTI WLSGSNALAGS S CERARQGQI YGLQTMVCVATENGWELGS SEI IHQS S 
DLVDKVDTFFNFNNGGGEFGS WAFNIjNPDQGENDPGLW I SEPNGVD SGLVAAPVT^NNGGN 
DSTSNSDSQPISKLCTSTGSSVENPNPKVLKSCEiy^ 
EGMLSFTSVLPCDSiraSDIiEASVAKEAESNRVVVEPEKKPRKRGRI^ 

ERQRREKLNQRFYSLRAWPNVSK^KASLLGDAISYISELKSKLQKAESDKEELQKQID 
VMNKEIAGNAKS S VKDRKCIaNQE S S VL I EME VD VKI I GWDAM I R I Q C S KRNHPG AKFMEAL 
KELDLEVNHASLSVVNDLMIQQATV^ 
>G668 (1..1056) 

ATGGGAAGACCACCTTGCTGTGAAAAGATTGGAGTGAAGAAAGGGCCATGGACACCAGAG 

GAAGAGATCATCTTGGTTTCTTACATCCAAGAACATGGTCCTGGAAACTGGAGATCTGTC 

CCAACACACACAGGTTTAAGATGTAGCAAGAGCTGCAGATTGAGATGGACTAATTATCTT 

CGACCCGGTATTAAGCGTGGAAATTTTACTGAGCATGAAGAGAAGACAATTGTTCATCTT 

CAAGCCCTTTTAGGCAACAGATGGGCAGCCATAGCATCATACCTTCC^GAAAGGACA 

AATGATATAAAGAACTATTGGAACACTCACTTGAAGAAGAAGCTCAAAAAGATTAATGAA 

TCTGGTGAAGAAGAFAATGATGGTGTCTCTTCATCAAACACTAGTTCACAAAAGAACCA^ 

CAAAGCACTAACAAAGGTCAATGGGAAAGAAGACTTCAGACAGAGAT^ 

CAAGCTCTTTGTGAGGCCTTGTCTTTAGACAAACCATC^TCCACTCTTTCATCAT 

TCATTACCGACACCAGTAATCACACAACAAAACATCCGTAACTTCTCAT(^GCTTTGCTT 

GACCGTTGTTATGATCCATCCTCTTCTTCTTCATCTACGACAACCACCACTACAAGCAAC 

ACTACTAATCCATACCCATCAGGGGTATATGCGTC7UVGTGCTGAGAACATCGCCCGGTTG 

CTTCAAGATTTCATGAAAGACACACCCAAGGCTTT^ 

TCAGAGACTGGACCACTCACTGCTGCAGTCTCGGAAGAAGGTGGAGAAGGGTTTGAACAA 

TCTTTCTTCAGCTTCAATTCAATGGACGAAACTCAAAACTTGACTCAGGAC^ 

TTCCATGATCAAGTGATCAAACCGGAAATAACAATGGACCAAGATCATGGTCTAATATCA 
CAAGGGTCTCTGTCTTTGTTTGAGAAATGGTTATTTGATGAGCAAAGCCACGAGATGGTT
GGTATGGCACTAGCAGGACAAGAAGGGATGTTCTAG 

>G668 Amino Acid Sequence (domain in AA coordinates: 13-113) 
MGRPPCCEKIGVKKGPWTPEEDIIIiVSYIQEHGPGNWRSVPTHTGLRCSKSCRLRVm^L 

RPGIKRGOTTEHEEKTIVHLQALLGNRWAA^ 

SGEEDNDGVSSSOTSSQIOraQSTNKGQWERRLQTDINMAKQALCEALSLDKPSS^^ 
SLPTPVITQQNIRNFSSALIiDRCYDPSSSSSSTTTTTTSNT™ 

LQDFMKMPKAIjTIjSSSSPVSETGPIjTAAVSEEGGEGFEQSFFSFNSITOETQNIjTQETSF 
FHDQVIKPEITMDQDHGLISQGSLSLFEKWLFDEQSHEMVGMAIiAGQEGMF* 
>G1050 (23..1582)

TTCCCCATTTCAGAAAATCAAAATGGGTGGTGGTGGTGATACAACAGATACCAATATGAT 

GCAGAGAGTTAATTCTTCTTCTGGTACATCGTCTTCTTCGATCCCTAAACACAATCTTCA 

CTTGAATCCTGCTCTTATCCGCTCTCACGATCACTTCCGTCACCCTTTCACCGGAGCTCC 

TCCACCGCCGATTCC^CCCATTTCTCCTTAC^CTCAGATCCCGGCGACTTTACAACCTAG 

ACATTCTCGCTCTATGTCG(^CCGTCTTCTTTCTTCTCCTTTGATTCATTGC 

AAATCCTTCTGCTCCGTCGGTTTCGGTGTCGGTGGAGGAGAAAACCGGTGCCGGATTTAG 

TCCTTCGTTGCCTCCGTCACCGTTTACGATGTGTCATTCTTCTAGCTCTAGGAACGCCGG 

AGATGGAGAGAATCTACCTCCGAGAAAGTCGCATAGGCGTTCGAATAGTGATGTTACTTT 

TGGGTTTAGTTCAATGATGTCTCAGAATCAAAAGTCTCCTCCTTTGAGTTCTTTGG 

ATCGATCTCTGGTGAAGATACATCAGATTGGTCTAATTTGGTGAAGAAAGAACCGAGAGA 

AGGCTTCTACAAGGGAAGAAAACCAGAGGTTGAAGC!AGCTATGGACGATGTTTTCACGGC 

TTATATGAATCTTGATAACATTGATGTCTTGAATTCTTTTGGAGGTGAAGATGGCAAGAA 

TGGGAATGAGAATGTGGAGGAGATGGAGAGTAGTAGAGGTAGTGGTACAAAGAAGACGAA 

TGGTGGAAGTAGTAGTGATTCTGAAGGAGATAGCAGTGCGAGTGGGAATGTGAAGGTTGC 

GTTGAGTTCTTCTTCTTCAGGCGTGAAGAGAAGAGCAGGTCMAGATATTGCTCCTACTGG 

TAGACATTACAGGAGTGTTTCTATGGACAGTTGTTTCZATGGGGAAGTTGAATTTCGGCGA 

CGAATCATCGCTAAAGCTTCCGCCTTCTTGAT 

TGAAGGGAATTCAAGTGCTTATAGTGTTGAATTTGGAAAC 

AATGAAGAAGATTGCAGCTGATGAGAAACTCGCTGAGATTGTAATGGCTGACCCTAAGCG 

TGTTAAAAGAATCTTGGCGAACCGCGTATCTGCTGCACGTTCAAAGGAGCGGAAGACGCG 

ATAC^TGGCAGAGTTGGAACAC^GGTGCAGAC^CTTCAGACTGAAGCTACTACATTATC 

GGCTCAGCTCACACATTTGCAGAGAGATTCTATGGGGTTGAC^AAACCAGAACAGTGAGCT 

GAAGTTTCGTCTTC^GCTATGGAGCAGC^GC^C^ACTCCGCGATGCTCTGTCAGAGAA 

ACTGAATGAAGAAGTCCAGCGGTTGAAACTGGTGATAGGGGAGCCG7^ACCGCAGGCAAAG 

TGGGAGCAGCAGCAGCGAATCAAAGATGTCACTAAACCCGGAGATGTTTCAGCAGCTTAG 

CATAAGTCAGTTACAACACCAACAGATGCAGCATTCCAATCAGTGTAGCACAATGAAAGC 

AAAGCACACTTC?\AACGACTAGGGTAAGTAAAACTGCGATCCGCAGTTGTCTAGTTACAT 

ATATGATAAGAATCTTTTGTGCAGAGTTCTGTTTTTGGAAGTTTTAAAGAAACATATATA 

AAGATTATGTCCGGGAAATTTGATCATATTTCCTGAAACATACAC^ 

TAATGGAGGACTTTCTTTCTGGACCA 

>G1050 Amino Acid Sequence (domain in AA coordinates: 372-425) 
MGGGGDTTDTNMMQRVNSSSGTSSSSIPKHNLHL^ 

SPYSQIPATLQPRHSRSMSQPSSFFSFDSLPPIiNPSAPSVSVSVEEKTGAGFSPSLPPSP 

FTMCHSSSSRNAGDGENIiPPRKSHRRSNSDVTFGFSSMMSQNQKSPPLSSLERSISGEDT 

SDWSNLVKKEPREGFYKGRKPEVEAAMDDVFTAYMNLDNIDVIiNSFGGEDGKNG^ 

MESSRGSGTKKTNGGSSSDSEGDSSASGNVKVALSSSSSGVKRRAGGDIAPTGRHYRSVS 

MDSCFMGKLOTGDESSLKLPPSSSAKVSPTNSGEGNSSAYSVEFGNSEFTAAEMKKIA^ 

EKLAE IVMADPKRVKRI IiANRVS AARS KERKTRYMAELEHKVQTLQTEATTLS AQIiTHIiQ 

RDSMGLTNQNSEI»KFRLQAMEQQAQLRDALSEK3jNEEVQRLKLVI GEPNRRQSGS S S SES 

KMSLNPEMFQQLSISQLQHQQMQHSNQCSTMKAKHTSND* 

>G1463 (199..1209)

TATCCTTCGCAAGACCCTTCCTCTATATAAGGAAGTTCATTTCATTTGGAGAGGACACGC 
TGACAAGCTGACTCTAGCAGATCTGGTACCGTCGAC^GTTTGAGATTTGCTTCATCCGGT 
TTTTTTATTTTCTGCAAAATATGTCACTCTCTCCCATTTTO 

AAGTTTGATCAACTTAGTATGCGTTTCTTTTTCTCTCTAGTTCCTCTGTTTCTTGGTCGA 
TTTAGTTTCGTTATGGCGGACACACTGCTCAACGCAGAAGACGAAGTAATAATCTCACGT
TATCTGAAGCCTATGATCGTTAACAGAGTATCATGGCCTGATCTCTTCATCGAAGACGCA 
GACGTGTTCAACAAGGATCCATATGTGAAGTTCCATGCTGAGATCCCTAGCTTCGTGATC
GTTAAACCACGAACAAAGGCTTGTGGTAAAACCGATGGATGTGATTCGGGTTGCTGGAGG 
ATCATTGGTCGTGATAAGCTGATAAAGTCGGAGGAGACTGGTAAGATTCTAGGGTTCAAG 
AAGATACTCAAGTTCTGCCTAAAGTGGAAACCTAGAGAATACAAGAGAAGTTTGGTAATG 
GAAGAGTATAGGCTTACCAATAACTTCAACTGGAAGGAAGATCATGTGATTTGCAAGATT
CGGCTTTTGTTTGAAGCAGAAATTAGTTTCTTGCTAGCCAAGCATTTCTACACTACATCA
GACTCACTTCCTCGAAATGTGCTGTTGCCAGCTI^ATGGATTCTGTTCACCAGATAAACAA 
GAGGAGGACGAATTTTATCCGGTGACGATAATGATTTCAGAAGGAAAAGATTGGCCTAGC 
TACGTTACCAACAACGTGTATTGTCTGCATCCATCGGAGCTTGTGAA 

AAGTTTCATGATAACGGAATCTGCATCTTCGCTAACAGGACTTGTGGTGTAACCGATAAA 
TGCAATGAAGGTTACTGGAAGATTAAGCACCGTGAGAAGCTGATCATGTCACGGTACGGG 
CAGACC^TTGGTTGGAAGAAAGTTTTTC&GTT:LTATC 

GGTAATGGAGAAGAAGTGAAGGTAACTTGGACTCTAAAAGAGTATAGGCTTACCAGAAAA 

ATGAACAAGAATAAAGTGGTGTGCGTTATCAAGTATAAGGTAAAGTGTTTACCGAGGATA 

ACTAGCTAGGGACTTCTACTCTTGGTTTCATGATCGATGCGACCGCTCTAGACAGGCCTC 

GTACCGGATCCTCTAGOTAGAGCTTTCGTTCGTATC^TCGGTTTCGACAACGTTCGTCA 

>G1463 Amino Acid Sequence (conserved domain in AA coordinates: 9-156)

MRFFFSLVPLFLGRFSFVMADTLLNAEDEVII 

PYVKFHAEIPSFVIVKPRTKACGKTDGOTSGCWRII^ 

LKWKPREYKRSLVMEEYRXiTNNFNWKQDHVICKIRLLFEA^ 

VTjLPAYGFCSPDKQEEDEFYPVTIMISEGKDWPSYVTNlfVYC^ 

I CI FANRTCG VTDKCNEGYWKIKHREKXtlMSRYGQTIGWKKVFQFYETEKERHFGNGEEV 

KVTWTLICEYRLTRKMNKNKVVCVIKYKVKCLPRITS* 

>G1944 (236..1306)

TCGACCTTCCTAATTTCCAACCTCTGTTCTTAGCAATATATTTTTTCTCCAAAAATAATT 

CTGftLGTTTGATTTTCTTCTTCTAGCTCTTAAGTATATTTCTTTGTTGTTATTTATCTTTT 

AATCCTTTAATCTCATCTTTGTTTATCTTTAATCAAAACCCAAAATTT 

GAAAATCTAGAAGAAATAAAGGAAACATAACAAAAATAGAAAGAAAAAGAAGCTAATGGT 

CTTAAATATGGAGTCTACCGGAGAAGCTGTTAGATCAACCACCGGTAACGACGGTGGTAT 

TACGGTGGTTAGATCCGACGCGCCGTCAGATTTCCACGTAGCTCAAAGATCAGAAAGCTC 

AAACC^^TCTCCC^CCTCTGTCACTCCTCCTCCACCAC^^ 

TCCTCCGCCGCTGCAAATTTCGACGGTGACGACTACGACTACGACGGCCGCGATGGAAGG 
TATCTCCGGTGGACTGATGAAGAAGAAGCGTGGACGGCCAAGGAAGTATGGACCGGACGG 
GACTGTTGTAGCGTTATCTCCTAAACCGATTTCATCAGCGCCGGCGCCGTCGCATCTTCC 
GCCGCCGAGTTCACACGTCATCGATTTCTCCGCTTCTGAGAAACGTAGCAAAGTGAAACC
AACGAACTCGTTTAACAGAACAAAGTATCATCACCAAGTTGAGAATTTGGGTGAATGGGC 
TCCTTGCTCCGTCGGTGGTAATTT(^CACCTCATATAATCACAGTCAACACCGGCGAGGA 
TGTAACAATGAAGATAATCTCGTTTTCGCAACAAGGACCTCGCTCTATTTGTGTTCTGTC 
AGCAAACGGTGTTATTTCAAGCGTTACACTTCGTCAGCCAGATTCCTCTGGCGGCACATT 
GACATACGAAGGTCGGTTTGAGATATTATCATTATCCGGGTCATTCATGCCTAATGATTC 
AGGCGGAACACGAAGTAGAACGGGAGGAATGAGTGTATCGTTAGCAAGTCCCGATGGACG 
TGTAGTAGGCGGTGGCCTCGCCGGTTTACTAGTAGCCGCGAGTCCGGTTCAGGTGGTTGT 
AGGAAGTTTTTTAGCGGGCACTGACCATCAAGATCAGAAACCGAAAAAGAACAAACATGA 
TTTCATGTTGTCGAGTCCTACCGCTGCAATTCCTATCTCTAGTGCAGCTGATCACCGGAC 
AATCCATTCGGTCTCGTCTCTTCCGGTCAATAATAATACATGGCAGACTTCTTTAGCTTC 
CGATCCAAGAAACAAGCATACCGATATTAATGTCAATGTAACTTGAAATCCAATCTTTCT 
CTGTATTTTCTGTTAACAAGTTTGATTTGGTTGTTTATCTACATTAGGATTTTACTAAAA 
TGGTAGTATTATTTATA(^GTTTTAGGGTCTTTATTTTGGTTCCACTGTTGTCACTTGTA 
GGATA 

>G1944 Amino Acid Sequence (domain in AA coordinates: 87-100)
^^VI^NMESTGEAVRSTTGNDGGITvVRSDAPSDFHVA 

TAPPPLQISTVTTTTTTAAMEGISGGLMKKia^GRPRKYGPDGTVVALSPKPISSA 

LPPPSSHVIDFSASEKRSKVTCPTNSFNRTKYHHQVENLGEWAPCSVGGOT 

EDVTMKIISFSQQGPRSICVLSANGVISSVTIjRQPDSSGGTL^ 

DSGGTRSRTGGMSVSIiT^PDGRWGGGIiAGLLVAASPVQVWGSFLAGTDHQDQKPKKNK 

hdfi^ssptaaipissaadhrtihsvsslpvnnotwqtslasdprnkhto 

>G2383 (37..990)
GACCTCTTTGATCCCTTCATTCCCCATCAAACAACCATG

ATTCAAAGCCCTAATTCTCACCATCACTACTCTTCGCCTTCTTTTCCTTTCTCTTCCGAT 
TTTCTTGAGAGTTTTGATGAATCCTTCTTGATAAACCAATTCTTGTTACAGCAGCAAGAT 
GTAGCAGCAAATGTTGTTGAATCTCCTTGGAAATTTTGCAAGAAGCTTGAGCTTAAGAAG 
AAGAATGAGAAGTGTGTTGATGGAAGCACCTCACAAGAGGTTCAATGGAGAAGGACGGTC 
AAAAAAAGGGACAGGCATAGTAAGATCTGCACGGCTCAAGGTCCTAGAGACCGGAGGATG 
AGGCTGTCTCTTCAGATTGCTCGCAAGTTTTTCGAT 

AAGGCGAGCAAGACGATTGAATGGCTTTTCTCCAAATCAAAGACTTCCATCAAACAACTT 
AAAGAAAGAGTGGCTGCATCGGAAGGAGGAGGAAAGGATGAACATCTCCAGGTTGATGAA 
AAGGAAAAGGATGAGACACTGAAGTTGAGAGTCTCAAAGAGAAGAACAAAGACTATGGAG 
AGCTCTTTTAAGACTAAAGAGTCGAGAGAGAGAGCTAGAAAGCGAGCAAGAGAGAGAACA 
ATGGGAAAGATGAAGATGAGATTATTTGAGACCTCGGAAACA 

GAAACTAGAGAGATCAAGATAACCAATGGTGTAGAATTAOTAGAAAAGGAAAATAAAGAA 

CAAGAATGGAGTAATACTAATGATGTTCACATGGTAGAGTATCAAATGGATTCTGTGAGC

ATCATAGAGAAGTTTCTTGGACTAACCAGTGACTCTAGCTCCTCTTCCATTTTTG^ 

TCCGAGGAATGTTAGACAAGTCTTAGTTCAGTAAGAGGTACAATTTCAGCAGCAGGTAAC 

AGCAATGTGTTAACTAAAAACCCTAATTGAGTAATGCAGTTTTGATTAATATTAGCTTTT 

TGGTAATTCCAGGAATGTCGACACCAAGGG 

>G2383 Amino Acid Sequence (conserved domain in AA coordinates: 89-149)

MFPSFITHIQSPNSHHHYSSPSFPFSSDFLESFDESFLINQFLLQQQDVAANVVESPWKF 

CKKLEIiKKKNEKCVX)GSTSQEVQWRRTVKKRDRHSKICTAQGPR^ 

LQDMLGFDKASKTIEWLFSKSKTSIKQLKERVAASEGGGKDEHLQVTDEKEKDETLKLRVS 

KRRTKTMESSFKTKESRERARKRARERTMAKMKMRIjFETSETI^ 

LLEKENKEQEWSNTITOVHMV^YQ 

GT I S AAGNSNVXiTKNPN* 

>G571 (326..1708)

TAGCCGACCTCTCTTCTCTCTTCTGAAAAAAACACCAAAGGAGCTTTAAATGCTCCGTTA 
CATAATCTCTATCTCTTTCCAAGAATATAGAGAAAGGAAAATAATATACAAGAATTAAAA 
GAAGGTATATGATCATCTCTCTAGCTAGTGA 

ATCAGCTTGCCTCAGAGGAGAAGACCAACATAAGAGAGATCGAAGATCAAAATCTATCTC 
TCTTCATCATCTTCTGCTGTTACTATCATATCACACGCTCTCTCAAACATCATCCTATAT 
ATAGACTTCTCTTGATCATCATCAAATGCAAGGT 

ATCATC^TCCTCCGCCACGTCTTCCCATGGAAACTTCATGAACAAAGATGGGTATGATA 

TGGAGAGATAGACCCATCACTCTTCCTCTATCTTGATGGACAAGGACATCATGATCCTCC 

ATCAACTGCTCCTTCTCCTTTACATCATCATCACACAACTCAGAATTTGGCGATGAGACC 

TCCAACATCGACGCTGAAGATCTTTCGATCTCAGCCTATGCACATAGAGCCACCTCCT 

TTCTACACACAATACCGATAATACAAGATTAGTTCCGGCTGCTCAACCTAGTGGTTCCAC 

TCGACCAGCTTCTGACCCGTCCATGGACTTGACGAAT 

TCAAGGTTCTAAATCCATCAAGAAGGAAGGGAACCGCAAGGGTCTTGCCTCATCGGACCA 
TGACATACCTAAATCGTCAGACCCTAAAACATTGAGAAGACTAGCACAAAACAGAGAAGC 
AGCAAGAAAAAGCAGATTACGTAAAAAGGCTTATGTTCAGCAACTCGAGTCATGTAGGAT 
CAAACTGACCCAACTAGAACAAGAGATTCAACGGGCCAGATCCCAAGGCGTATTCTTTGG 
AGGGTCTCTTATAGGAGGAGATCAACAGCAAGGTGGACTACCCATTGGCCCTGGCAACAT 
CAGCTCTGAAGCAGCGGTGTTCGATATGGAATATGCGAGGTGGCTGGAGGAGCAGCAGAG 
GCTATTAAACGAACTAAGGGTGGCAACACAAGAACACTTGTCCGAGAACGAGCTTAGGAT 
GTTTGTGGACACATGTTTAGCTCATTATGACCATTTGATTAACCTCAAGGCTATGGTCGC 
TAAGACCGATGTCTTCCACCTCATTTCTGGAGCATGGAAAACTCCAGCTGAACGTTGCTT 
CTTGTGGATGGGTGGTTTCCGTCCATCGGAGATCATTAAGGTGATTGTGAACCAGATAGA 
ACC^TTGACGGAGCAACAGATAGTTGGGATATGTGGGCTGCAACAGTCCACACAAGAGGC 
CGAGGAGGCTCTCTCGCAAGGCCTCGAGGCGTTGAATCAATCACTTTCCGATAGCATTGT 
CTCTGACTCCCTCCCGCCTGCCTCCGCACCACTTCCTCCTCATCTATCCAATTTCATGTC 
ACACATGTCCTTAGCTCTCAACAAGCTCTCTGCTCTCGAGGGCTTCGTTCTCCAGGCGGA 
TAATTTGAGGCACCAAACGATCCATAGGCTGAACCAATTGTTGACGACCCGTCAAGAAGC 
ACGGTGTCTTCTAGCCGTTGCGGAGTACTTCCACCGTCTTCAAGCTCTAAGTTCTCTCTG 
GCTAGCCCGTCCTCGGCAAGATGGATAATACTAAAACAACTGATGAAGGAAACCAAAAAC 
AAAAACAAGAGAATAGGTTGATTAGTTAGCCGCCAGCTTGAC^ 

GTCTCTCTACTCAAATACAGTGCAATTAGGGAAAATTGTTTGGCTTCTTTTTGGTATATG 
ATTCTTACTATTATGTTTTTAATCAAGA

>G571 Amino Acid Sequence (domain in AA coordinates: 160-220)

MQGHHQNHHQHLSSSSATSSHGNFMNKDGYDIGEIDPSLFLYLDGQGHHDPPSTAPSPLH 

HHOTTQNLAMRPPTSTLNIFPSQ 

DLTNHSQFHQPPQGSKSIKKEGNRKGLASSDHDIPKSSDPKTLRIUiA 

KAYVQQIjESCRIKXiTQIjEQEIQRARSQGVFFGGSIjIGGDQQQGGLPIGPGNISSEAAVFD 
MEYARWLEEQQRIiIjNEIjRVATQEHLSENELRMFViyrCLAHYDHIj INIiKAMVAKTDVFHL I 
SGAWKTPAERCFLWMGGFRPSEIIKVIVNQIEPLTEQQIVGICGIiQQSTQEAEEAIiSQGL 
EALNQSLSDSIVSDSLPPASAPLPPHLSNFMSHMSLALN^ 
RIjNQIJjTTRQEARCIjIjAVAB YFHRLQALS S LWLARPRQDG * 
>G636 (6..1814)

CGATGATGCAACTGGGTGGTGGTACTCCGACCACTACAGCGGCGGCTACAACCGTCACAA 
CTGCTACAGCACCACCGCCACAATCAAACAACAACGATTCAGCGGCA 

CAGCAGCGGTTGGGGCGTTTGAGGTGTCGGAAGAGATGCACGACCGTGGGTTTGGAGGAA 
ATCGTTGGCCGCGGCAGGAAACGCTAGCGTTGTTGAAAATACGATCTGACATGGGAATAG 
CGTTTCGAGACGCTAGCGTTAAAGGTCCCTTATGGGAAGAGGTTTCTAGGAAAATGGCGG 
AGCATGGTTACATAAGAAACGCAAAGAAATGCAAAGAGAAATTCG^ 

ACCACAAACGAACCAAAGAAGGTCGTACCGGAAAATCCGAAGGCAAAACTTATCGCTTCT 

TTGATCAATTAGAAGCTCTCGAGTCTCAATCTACAACCTC^ 

AAACGCCTCTTCGACCACAGCAAAACAACAACAAC 

CCATATTTTCAACTCCTCCTCCGGTAACGACAGTTATGCCGACGCTTCCTTCTTCATCAA 
TTCCTCCGTATACTCAGCAGATTAATGTACCTTCGTTTCCAAAGATCTCCGGTGATTTTG 
TATCGGATAATTCTACATCGTCTTCGTCTTCTTATTCGACTTCTTCTGACATGGAGATGG 
GTGGTGGAACTGCGACTACAAGGAAGAAAAGGAAGAGGAAATGGAAGGTGTTTTTCGAGC 
GGTTGATGAAACAAGTAGTTGATAAACAGGAAGAGCTTCAACGCACATTCTTGGAAGCTG 
TTGAAAAGCGAGAACACAAGAGATTGGTTAGAGAAGAGTCTTGGAGAGTTCAAGAGATTG 
CCAGAATCAACCGCGAGCACGAGATCTTAGCTCAAGAACGCTCTATGTCCGCTGCAAAAG 
ACGCTGCTGTTATGGCCTTTCTTCAAAAACTGTCAGAGAAACAACCGAATCAGCCACAAC 
CGCAGCCTCAGCCGCAACAAGTTCGACCATCAATC 

AACCGCCTCAACGGTCTCCTCCACCGCAACCTCCTGCTCCGCTTCCGCAGCCAATTCAAG 
CGGTTGTGTCGACGTTAGACACAACGAAAACGCACAATCGTGGTGATCAGAATATGACTC 
CTGCAGCTTCAGCGAGCTCGTCGCGGTGGCCGAAAGTGGAGATAGAAGCATTGATAAAGC 
TGAGGACGAATCTTGATTCGAAATATCAAGAAAACGGACCAAAAGGACCATTGTGGGAAG 
AGATATCAGCGGGAATGAGAAGGTTAGGATTCAACAGGAACTCAAAGAGATGCAAAGAGA 
AATGGGAAAACATAAAGAAATACTTCAAGAAAGTCAAAGAGAGCAACAAGAAACGTCCCG 
AAGATTCCAAGACTTGCCCTTACTTTCACCAGCTTGATGCTTTATATAGAGAGAGGAACA 
AATTCCACAGCAAGAACAACATTGCAGCTTCTTCTTCATCTTCCGGTCTTGTTAAACCGG 
ATAATTCTGTTCCCTTGATGGTCCAACCAGAGCAGCAATGGCCTCCGGCTGTAACGACTG 
CGACAACTACTCCCGCAGCGGCTCAGCCTGATCAGCAATCTCAGCCGTCGGAGCAGAACT 
TTGATGATGAAGAAGGTACAGATGAAGAGTACGACGATGAAGATGAGGAAGAGGAGAATG 
AAGAAGAGGAAGGAGGTGAGTTCGAGCTTGTGCCTAGCAATAACAACAACAACAAGACGA 
CGAATAATCTGTAATGATGATGATTCGAGTTCGAACCGGTTTGGTGGTGAAAGATTAGTA 
ATCTTTTTTTAAGTTTTGATACAGAACATGAGAATTTAAATATTGGAGGGTTT 

>G636 Amino Acid Sequence (domain in AA coordinates: 55-145, 405-498) 
MQLGGGTPTTTAAATTVTTATAPPPQSNNNDSAA 

WPRQETLALLKIRSDMGIAFRDASVKGPL^EVSRKMAEHGYIRNAKKCKEKFENVYKYH 
KRTKEGRTGKSEGKTYRFFDQLEALESQSTTSLHH^^ 

FSTPPPVTTVMPTIiPSSSIPPYTQQINVPSFPNISGDFLSDNSTSSSSSYSTSSDMEMGG 

GTATTRKKRKRKWKOTFERLMKQVVDKQEELQRTFLEAVEKREHKRL 

INREHEILAQERSMSAAKDAAVMAFLQKLSEKQPNQPQPQPQPQ 

PQRSPPPQPPAPLPQPIQAVVSTLDTTKTHNRGDQNMTPAASASSSRWPKVEIEALIKLR 
TNLDSKYQENGPKGPLWEEISAGMRRLGFNRNSKRC^ 

S KTCP YFHQIjDAIjYRERNKFHSNNNIAAS S S S SGLVKPDNSVPIjMVQPEQQWPPAVTTAT 

TTPAAAQPDQQSQPSEQNFDDEEGTDEEYDDEDEEEENEEEEGGEFELVPS*^^ 

NL* 

>G878 (197..1738)

CAAAAAAAATCTCTCCCATTAAAAGACTGCCCAAAGAAATATTTTATACAAAATGAAAGA 
GAGAAACACGACACGAATTTTGTATAATTAAGATTACACAAAAAAAAGTGTTAGAAAGAG
AAATATCTTCTTCTTTTTTCTOTGTGAGTTC 

TCAAAATCAAGAATCGATGGCGGAGAAGGAAGAAAAAGAACCATCGAAGTTAAAATCATC 

CACCGGAGTTTCACGGCCAACGATTTCACTACCTCCTCGACCGTTTGGTGAAATGTTTTT 

TAGCGGTGGCGTTGGATTTAGTCCTGGACCAATGACTCTCGTCTCAAATTTATTCTCTGA 

TCCTGATGAGTTCAAGTCTTTCTCTCAGCTTTTAGCTGGAGCTATGGCTTCTCCGGCGGC 

AGCTGCTGTTGCCGCCGCTGCTGTGGTTGCTACTGCTCATCATCAGACACCTGTGAGCTC 

TGTCGGTGATGGCGGTGGAAGCGGTGGTGATGTTGACCCGAGGTTTAAGCAGAGTAGACC 

AACGGGATTGATGATAACTCAACCACCGGGGATGTTTACTGTACCGCCGGGGTTAAGTCC 

GGCTACTCTTTTGGATTCTCCGAGCTTCTTTGGTCTTTTTTCACCTCTTCAGGGAACATT 

TGGTATGACACATGAACAAGCTTTAGCACAAGTCACTGCACAAGCAGT^ 

TGTTCATATGCAGCAATCACAACAATCTGAATAT 

ACAACAACAACAAGCTTCATTGACTGAGATTCCATCATTTTCTTCT 

GATTCGAGCCTCGGTTCAAGAAAGATCGCAGGGTCAGAGAGAGACTTCGGAAATATCTGT 

CTTTGAGCATCGGTCACAGCCTCAAAATGCTGACAAACCAGCTGATGATGGATACAACTG 

GCGGAAATATGGGCAGAAGCAAGTGAAGGGGAGCGATTTTCCTCGGAGTTATTACAAATG 

TACGCATCCAGCTTGTCCTGTCAAGAAGAAAGTGGAGAGGTCACTCGATGGACAAGTAAC 

GGAAATCATCTACAAGGGTCAAGACAATCATGAGCTTCCTCAAAAGCGCGG 

CGGGAGTTGTAAAAGTTCTGATATTGCAAATCAGTTTCAAACAAGTAATAGCAGTCTCZAA 

CAAGAGTAAGAGGGACCAGGAAACAAGCCAAGTTACAACAAGAGAGC^ 

AAGTGATAGCGAGGAGGTTGGGAATGCAGAGACTAGTGTGGGAGAAAGACATGAGGATGA 

GCCTGATCCCAAGCGAAGAAATACAGAAGTTCGGGTTTG 

TAGAACTGTGACAGAGCCTAGGATTATTGTCCAAACGACGAGTGAAGTTGACCTCTTAGA 

TGATGGATATAGGTGGCGCAAGTATGGTCAGAAAGTAGTCAAAGGAAATCCTTATCCGAG 

GAGCTACTATAAGTGTACAACACCAGATTGCGGAGTAAGGAAACATGTAGAGAGAGCAGC 

AACTGACCCAAAAGCTGTTGTAACAACATATGAAGGTAAACATAACCATGATGTTCCAGC 

TGCTAGAACCAGCAGCCATCAGTTAAGACCAAACAATCAACACAACAC 

CTTCAATCATCAACAGCCTGTTGGACGTTTAA 

GAGAAGAAGAATACGACGGCGCTTGAGCTTTTGTGAGTTTAATGAATCTTCTTTTTGGTT 

AATGAACCTGTTTTTGTTGCCTCAAAAGACCACAGGTTTCTCTGGACAGAATCTCTGATA 

TTACAGTTTCAAAAGGTATGTTCTTTTATTTCATGTTGGAATCTTCTGTGTAATCTTAAG 

AAGCTTTAGGAGGTAATGTAAAAAACCAGATTCAAAGTTATGCCCTTATGTGA 

TGTACATGGGATAAACAAAATTTACAGGTATCCTTTTTGTTCTTGTTGTAAAAAAAAAAA 

AAAA 

>G878 Amino Acid Sequence (domain in AA coordinates: 250-305, 415-475)

MAEKEEKEPSKLKSSTGVSRPTISLPPRPFGEMFFSGGVGFSPGPMTLVSNLFSDPDEFK

SFSQLIAGAMASPAAAAVAAAAVVATAHHQTPVSSVGDGGGSGGDVDPRFKQSRPTGLMI

TQPPGMFTVPPGLSPATLLDSPSFFGLFSPLQGTFGMTHQQALAQVTAQAVQGNNVHMQQ

SQQSEYPSSTQQQQQQQQQASLTEIPSFSSAPRSQIRASVQETSQGQRETSEISVFEHRS 

QPQNADKPADDGY1WRKYGQKQVKGSDFPRSYYKCTHPACPVKKKVERSLDGQVTEIIYK 

GQHNHELPQKRGNNNGSCKSSDIANQFQTSNSSLl^SKRDQETSQVTTTEQMS^ 

VGNAETSVGERHEDEPDPKRRNTEVRVSEPVASSHRTVTEPRIIVQTTSEVDLLDDGYRW 

RKYGQKVVTCGNPYPRSYYKCTTPDCGVRKHVERAATDPK^ 

HQLRPNNQHNTSTVNFNHQQPVARXiRLKEEQIT* 

>G1134 (61..849)

TAAAGAAAGAGAAAAAAAGCTTTCGTAGTGTCTATTGAAACCAGAGAAAAGCCAAAGGGG 
ATGCAACCAACATCCGTCGGTAGTAGCGGCGGTGGTGACGACGGAGGAGGCAGAGGAGGA 
GGAGGAGGGCTAAGTAGAAGTGGACTATCTCGGATCCGTTCAGCTCCAGCGACTTGGCTT 
GAAGCTTTACTTGAGGAAGATGAAGAAGAGTCTTTGAAACOTAATCTTGGTCTCACCGAT 
TTGCTTACCGGGAACTCGAACGATTTACCGACAAGTCGCGGCTCGTTCGAGTTCCCGATT 
CCTGTTGAGCAAGGGTTGTATCAACAAGGTGGGTTTCACCGACAGAATAGTACTCCGGCG 
GATTTTCTTAGTGGTTCTGATGGATTTATCCAAAGCTTTGGGATTCAGGCGAATTACGAT 
TACTTATCGGGGAATATCGATGTTTCTCCGGGAAGTAAGCGGTCTAGAGAAATGGAAGCA 
CTCTTCTCTTCTCCTGAGTTTACTTCTCAAATGAAAGGAGAGCAAAGCAGCGGTCAAGTT 
CCTACCGGAGTATCAAGCATGTCGGATATGAACATGGAGAACCTTATGGAGGACTCTGTT 
GCTTTTAGGGTTCGGGCTAAACGTGGTTGCGCAACTCATCCCCGCAGCATTGCCGAGAGG 
GTACGAAGGACGCGGATTAGTGATCGGATAAGGAAGCTACAAGAGCTTGTACCTAACATG 
GACAAGCAAACCAACACTGCAGACATGTT

CAAAGGCAGATCCAGGAGTTAACAGAAGAACAGAAGAGGTGCACATGCATACCTAAGGAA 
GAACAATAAGGTTTGCTCCTGATTTGTTTTATATTTGCTTAACGGCAATGATCTGATCGA 
AAAATTCGAAAGATGATCITAGCTTGAATTTAGATC 

TTTGATAAATGGATGTAGGTGTAATATAAAATTTTTGTACAATAATGAAGAAAGTTAAAA 
AGAATTAATGAAAACATATATTCTTTATGATATAAAAAAAAAAA 

>G1134 Amino Acid Sequence (domain in AA coordinates: 198-247) 

MQPTSVGSSGGGDDGGGRGGGGGLSRSGLSRIRSAPATWLEALLEEDEEESLKPNLGLTD 

LliTGMSNDtiPTSRGSFEFPIPVEQGLYQQGGFHRQNSTPADFIiSGSDGFIQSFGIQANYD 

YLSGNIDVSPGSKRSREMEALFSSPEFTSQMKGEQSSGQVPTGVSSMSDMN^ 

AFRVRAKRGCATHPRS I AERVRRTRI SDRIRKLQELVPNMDKQTNTADMIjEEAVEYVKVL 

QRQIQELTEEQKRCTCIPKEEQ* 

>G1008 (89..973)

GCCTTTTTGACTCTTCTTTCTCTCITCTACTTTTTTTCAGGCTCTCTCTCTATATC 

TCTTCTTCTCCGGTTAACTAAAAGAGAAATGAAAAGCCGAGTGAGAAAATCCAAGTACAC 

GGTTCACCGGAAAATCACATCGACACCGTTCGACGGT^ 

AGTCACTGACCCATGCGCTACTGATTCTTCCAGCGATGAGGAAAACGACAAGAAATCTGT 
TGCTCCGAGGGTGAAACGTTATGTGGATGAGATCAGGTTCTGTGACGAAGATGACGAACC 
TAAACCGGCGAGGAAAGCGAAGAAAAAGTCCCCGGCGGCTGCGGCGGAGAACGGTGGAGA 
TTTGGTAAAGTCTGTGGTGAAGTATAGAGGAGTGAGACAACGACCTTGGGGAAAATTTGC 
GGCGGAGATTCGTGATCCTTCGAGTCGTACTAGACTCTGGCTTGGGACTTTTGCGACGGC 
GGAGGAAGCTGCTATAGGTTACGATAGAGCCGCGATTCGAATCAAAGGTCATAACGCTCA 
GACGAATTTTCTCACTCCTCCTCCTAGTCCGACGACTGAGGTGTTACCGGAAACTCCGGT 
GATTGACCTTGAAACTGTCTCTGGTTGTGATTCGGCGAGGGAATCGCAAATCAGTCTGTG 
TTCTCCGACTTCTGTTCTCCGGTTTAGTCACAACGACGAAACAGAGTACAGAACAGAGCC 
AACGGAAGAACAAAATCCGTTTTTCTTGCCTGATTTGTTTCGCTCCGGAGATTATTTTTG 
GGATTCCGAAATTACCCCTGACCCTTTGTTTCTCGACGAATTCCACCAGTCCTTGTTACC 
AAAC^TCAAGAACAACAACACAGTGTGTGATAAGGATACGAAT 

GTTGGGAGTGATCGGAGATTTCAGCTCATGGGATGTTGATGAGTTTTTCCAAGATCATT 

GTTGGATAAGTAATTTGATGAGTTCTTCCCCAGAATTTTTCTGGGTTTCTCTTTTTGGTT 

GTGTGAGTGAGATGAGTGGTTTGATGACAACGACGGGGATGAATCTTAGCCGTCCGTTTT 

CCATTTCGTGGACGGCTCCGATCAGCGGAAGAAGCGCAACGGAGTTTTTATTTATCTGTT 

TGAGAATTTTATAATTTAATTTGCGAGTAAATATAGTAATTAGTGTTAAGATTGTGAGAG 

TTTAAGTTAATTAGGGAGGGGTTTTGAATATTGGGGATTTTGGGAGGTTTTTGTTTGGTT 

TCTCTCCAAGTCTGTCACTATGCAAGGAAGCAGTATAAAGACCGTATATATATTTTATTA 

TTAATATTGATAAAAGTAAAAAAAAAAAAAAAAA 

>G1008 Amino Acid Sequence (domain in AA coordinates: 96-163) 
MKSRVRKSKYTVHRKITSTPFDGFPKIVKIIVTDPCATDSSSDEENDNKSVAPRVKRYVD
EIRFCDEDDEPKPARKAKKKSPAAAAENGGDLVKSVVKYRGVRQRPWGKFAAEIRDPSSR
TRLWLGTFATAEEAAIGYDRAAIRIKGHNAQTNFLTPPPSPTTEVLPETPVIDLETVSGC
DSARESQISLCSPTSVLRFSHNDETEYRTEPTEEQNPFFLPDLFRSGDYFWDSEITPDPL
FLDEFHQSLIiPNIJttlNNTVC^ 
>G1020 (132..689)

CTGTTGACAAGAAAGCTCCCCAAAAGGAGCGTTGCTTTACTCTCCTATAAAAAGAAGCTC 
TTCTACTTCTTCTCGTTACCACAAAACTCTTTCACCGATCTTCTCGTTCCATTCTTCTTC 
CTAATTACACCATGCCCAACATCACCATGGGTTTGAAACCCGACCCGGTTGCTCCAACGA 
ACCCGACTCATCATGAGAGTAATGCTGCCAAAGAGATTCGTTACAGAGGCGTTAGGAAAC 
GTCCATGGGGAAGArPACGCCGCTGAGATCCGAGATCCGGTTAAGAAAACTCGAGTCTGGC 
TCGGTACGTTCGACACCGCTCAGCAGGCGGCGCGTGCTTACGACGCAGCCGCGCGTGACT 
TTCGTGGTGTTAAGGeTAAGACCAATTTCGGTGTTATCGTTGGTAGTAGTCCTACTCAGA 
GTAGCACCGTCGTCGACTCTCCCACGGCGGCACGGTTTATAACACCTCCGCACCTCGAGC 
TCAGCTTAGGCGGCGGCGGCGCGTGTCGTCGTAAGATCCCGCTTGTGCATCCGGTTTACT 
ACTATAACATGGCGACGTATCCAAAGATGACGACGTGTGGTGTCCAGAGCGAGTCTGAAA 
CGTCGTCGGTCGTTGATTTCGAAGGTGGAGCTGGGAAGATATCTCCGCCGTTAGATCTGG 
ATCTTAACTTAGCTCCTCCGGCGGAATAGGCCGTGAGTTTTTTTTTTCTTATGTCGTTTC 
TTTAGACAAAAAAA7VATAACGTTTCCTTTTTTTTTCTGCCTAAGAAAAAAATATTATCCG 
TTTTTTAGAAGAAAAAAAAAAAAAAAAAAAAAA 
>G1020 Amino Acid Sequence (domain in AA coordinates: 28-95)

MPNITMGLKPDPVAPTOPTHHESNAAKEIRYRGVRKRPWGRYAAEIi^ 

DTAQQAARAYDAAARDFRGVKAKTNFGVIVGSSPTQSSTVTO 

GGGACRRKIPLVHPVYYYNMATYPKMTTCGVQSESETSSVVDFEGGAGKISPPLDLDLNL
APPAE* 

>G1023 (252..1250)

TCGTCTTCTTAATCGCTTTCTGCTCTGTT 

CTCTCAGTGATTGATTTCTCACAGTTTCATCGATT^ 

CTTGTTCTGGGGTAAAGGACTTTTCTTGTTCTTC 

GGAATTTTGAGAGGTTTTTTAGGGTTTAAGGGGGTT^ 

TGTTCGATAAAATGGCTGAACGAAAGAAACGCTCTTCTATTCAAACCAATAAACCCAACA 
AAAAACCCATGAAGAAGAAACCTTTTCAGCTAAATCACCTCCCAGGTTTATCTGAAGATT 
TGAAGACTATGAGAAAACTCCGTTTCGTTGTGAATGATCCTTACGCTACTGACTACTCAT 
CAAGCGAAGAAGAAGAAAGGAGTCAGAGAAGGAAACGTTATGTCTGTGAGATCGATCTTC 
CTTTCX3CTCAAGCTGCTACTCAAGCAGAATCTGAAAGCTCATATTGTCAGGAGAGTAACA 
ATAATGGTGTAAGCAAGACTAAAATCTCAGCTTGTAGC^ 

CATCTCCGGTCGTTGGACGTTCTTCTACTACTGTCTCGAAGCCTGTTGGTGTTAGGCAGA 
GGAAATGGGGTAAATGGGCTGCTGAGATTAGACATCCAATCACCAAAGTAAGAACTTGGT 
TGGGTACTTACGAGACGCTTGAACAAGCAGCTGATGCTTATGCTACCAAGAAGCTTGAGT 
TTGATGCTCTGGCTGCAGCCACTTCT 

CTATGATCTCAGCCTGAGGGTCAAGCATTGATCTTGACAAGAAGCTAGTTGATTCGACTC 
TTGATCAACAAGCTGGTGAATCGAAGAAAGCGAGTTTTGATTTCGACTTTGCAGATCTAC 
AGATTCCTGAAATGGGTTGCTTCATTGATGACTCATTCATCCGAAATGCTTGTGAGCTTG 
ATTTTCTCrrTAACAGAAGAGAACAACAACCAAATGTTGGATGATTACTGTGGCATAGATG 
ATCTGGACATCATTGGTCTTGAATGTGACGGTCCAAGCGAACTTCCAGACTATGATTTCT 
CAGATGTGGAGATCGATCTTGGTCTCATTGGAACCACCATTGACAAGTATGCTTTCGTTG 
ATCATATCGCAACAACTACTCCCACTCCTCTTAATATCGCGTGCCCATAAGTTTTGCAGC 
TAGGTGTTATTATTAGCTATAGGAGCAACGTAAAAAGCTCGTTGTTACTCGGTTTTGTCT 
TAAGTTATTAAAGTATAGCAGAGGCAGTTAATCTCAAGGGAAGCAAAAACCCTAAAGATA 
GAAGCAGATGGAGTTTTGTGTGTTGGTGTTACTAAAGAAAGTTTTGTTGACATAATGGTT 
TTGATGTTGTGGAGAAGATAGAGAGGTGTGATCGAAATTGTAAATCTCAGGTGGTTTTTT 
TTGAAGGCAATTGTTTCTCATTTAGGGTTTTTTTCTATATGAGGATTGTCTTTGAAAAGC 
CTTTAGATGTTTTCTAATTCGTAAGCTCTCTCAATCTTTGTAAGTTTTGCCTGTTGAGTT 
ATTGATACATATGTGAGACCTACTTTATTTGTTTTGTGCTAGATACATTGTTGATGGTTT 
CGTCAAAAAAAAA 

>G1023 Amino Acid Sequence (conserved domain in AA coordinates: 128-195)
MAERKKRSSIQTNKPNKKPMKKKPFQLNHLPGLS S SEE 

EERSQRRKRYVCEIDLPFAQAATQAESESSYCQESNNNGVSKTKISACSKKVLRSKASPV
VGRSSTTVSKPVGVRQRKWGKWAAEIRHPITKVRTWLGTYETLEQAADAYATKKLEFDAL
AAATSAASSVLSNESGSMISASGSSIDLDKKLVDSTLDQQAGESKKASFDFDFADLQIPE
MGCFIDDSFIPNACELDFLLTEENNNQMLDDYCGIDDLDIIGLECDGPSELPDYDFSDVE
IDLGLIGTTIDKYAFVDHIATTTPTPLNIACP*
>G1053 (38..538)

GAAACTCTTACATACTCATATAAACCAAACTAAAACCATGATTCCGGCAGAAATCAACGG 
ATATTTCCAATATCTATCACCGGAATACAACGTAATAAACATGCCTTCATCTCCAACCTC 
TTCCTTAAACTACCTAAACGATTTGATCATCAACAACAACAACTATTCCTCATCATCCAA 
C^GTCAAGATCTCATGATAAGCAAC^U^CTC^CTTCCGACGAAGATCATCATCAAAGCAT 
CATGGTACTCGACGAGAGGAAACAGAGAAGGATGCTTTCGAACAGAGAATCTGCAAGGAG 
GTCAAGGATGAGGAAACAGAGACATCTTGATGAACTCTGGTCTCAGGTAATAAGGCTTCG 
CAACGAGAACAACTGTCTTATCGATAAGCTGAACCGCGTATCGGAGACTCAAAATTGTGT 
ATTGAAGGAGAACTCTAAACTCAAAGAAGAAGCTTCTGATCTCCGACAGCTTGTTTGTGA 
ACTGAAATCTAACAAGAACAACAACAATAGTTTTCCAAGAGAGTTTGAAGATAATTAGTA 
TTACTCAAA 

>G1053 Amino Acid Sequence (domain in AA coordinates: 74-120) 

MIPAEINGYFQYLSPEYNVINMPSSPTSSLNYLNDLIIN^ 

DEDHHQSIMVLDERKQRRl^SNRESARRS 

VSETQNCVLKENSKLKEEASDLRQIjVCEIjKSNKNNNNSFPREFEDN* 
>G1137 (202..1248)

TACTTCAGACTTCTACTCAAACCAGTCACGTAGTTGGT^ 

TCAATCTGTGATTGTTTTTCGTTCGTCTTTCTTTTACTATTTTCTCGAAAAGGACACAA^ 

AAGTATTGCATTCACTCAGTTGAGCAACTTAACAATCGTGTTGTACTTTTTGAAGTTCCC 

TTGAGCTAAACTGCTAAGAGCATGCCTCTGGATAAGAGGCAACGGGATTTGCCTCTGGGC 

TTAAGTCCTCAAGCTTGCTTCAAGGATATAGTAGGTCGGTCTGTCCTTCCTAGAATTCCT 

CTCCCTGAGCTTGGGAAACTATATGCAGCTAAGCTTCAGGCTCGCTGTTTGCAGCCACCA 

CCATTCCAGTCTTTGCTGTGCAGTCATGATAAGGAGTCTTATGGAAAAAGATTCT 

TCTGACATGCGGTCTTGGTGCGCTGCTGCTACTACTACTACTACTCCACTTGGAGCATTA 

GAGTCTTCTC^GAAAAGACTTTTGATATTCGATCAGTCAGGAGACCAGACTCGTCTATTA 

CAATGTCCATTTCCTCTACGGTTTCCATCTCATGCGGCTGCAGAACCAGTGAAACTCTCT 

GAGTTACAAGGTATAGAGAAAGCTTTCAAAGAAGATGGTGAAGAGTTTCACAAGAGTGAT 

GGAACAGAGTC^GAAATGCATGAAGACACTGAGGAGATCAATGCATTGCTATATTCAGAT 

GATGATTATGATGATGATTGCGAGAGTGATGATGAAGTAATGAGCACTGGTCACTCTCCT 

TATCGAAATGAAGGAGTTTGCAACAAAAGGGAATTAGAAGAAATCGATGGTCCTTGTAAA 

AGGCAGAAACTACTGGATAAGGTCAACAACATCAGCGACTTATCATCACTTGTGGGCACT 

GAGAGCTCCACACAACTGAATGGATCTTCCTTTCTTAAGGACAAAAAGCTCCCT 

AAAACCATATCGACCAAAGAGGACACTGGTTCTGGTCTGAGCAACGAGCAGTCGAAGAAA 

GACAAGATCCGCACAGCTCTGAAAATACTCGAGAGCGTAGTCCCTGGTGCAAAAGGAAAC 

GAAGCGCTCTTACTTCTGGACGAAGCAATTGATTACCTAAAGTTGCTGAAACGAGACTTA 

ATCTCCACAGAGGTTAAGAACCAAAGCTCCACCACTC^^ 

AAAGAGA(^(^TGGGGAACAAGAAATCTGCAGA(^GATAAGGCGTGAAAGATTCTGACG 

AGTTAAAACGTGTGAAGTGGGTTTTTGGGTACGTATCCTTGCACCAGCTTT 

>G1137 Amino Acid Sequence (domain in AA coordinates: 264-314)

MPLDKRQRDLPLGLSPQACFKDIVGRSVLPRIPLPELGKLYAAKLQARCLQPPPFQSLLC

SHDKESYGKRFSRSDMRSWCAAATTTTTPLGALESSQKRLLIFDQSGDQTRLLQCPFPLR

FPSHAAAEPVKLSELQGIEKAFKEDGEEFHKSDGTESEMHEDTEEINALLYSDDDYDDDC

ESDDEVMSTGHSPYPNEGVCNKRELEEIDGPCKRQKLLDKVNNISDLSSLVGTESSTQLN

GSSFLKDKKLPESKTISTKEDTGSGLSNEQSKKDKI^ 

EAIDYLKLLKRDLISTEVKNQS stthkspilllkettwgtrnlqtdka* 

>G1181 (113..1012)

ctcgatcttttaacccccattato 
ttcgtgactttc^ggggacacttttc^ 

gccgccggttgacgcaatgattaccggagaatc^tcgtc^caaagatctatcccaacgcc 
gtttctcacaaaaacgtttaacctcgttgaagatagttccatcgacgatgttatctcatg 
gaacgaagatggttcctctttcatcgtatggaatccgacagatttcgctaaagatttgct 
tcctaaacacttcaaacacaacaatttctctagtttcgttcgtcagctcaacacttacgg 

attc^aaaaagttgtaccggatcgatgggagttt^^ 

aaaacgtcttctccgtgagatccaacgtcggaaaataacaacgacgcatg?^aacagttgt 
tgctccttcgtcggaacaacgaaaccagacgatggttgtatcaccgtcaaattccgggga 
agataataataataatcaggtgatgtcttcgtctccgtcgtcgtggtattgtcatcaaac 
gaagacgactgggaatggtggtttatcagtggagttattggaagagaacgagaagcttcg 
gagtcaaaacattcagctaaaccgtgagcttactcagatgaaatctatctgcgataatat 
ctatagtctcatgtcgaattacgtcggatctcagcccactgatcggagttattctcccgg 
aggtagtagtagtcaaccgatggagtttttaccggcgaagcggttttcggagatggagat 
tgaagaagaagaagaagcgagtccgaggttgtttggtgttccgattgggttaaaacggac 
gagaagtgaaggtgttcaggtgaagacgacggcggtggttggggaaaattccgatgagga 
gacgccgtggttgagacattataatcgaaccaatcagagagtttgtaattaaaaacgaac 
ggtttagatttgtggtgtagatatgtgcgcgaagtagacgattacagctttttaagacaa 
gcagagcacgtgtcceatctgtttcaagaagtttctgcaatcttgacttcttcttttaac 
actttgtgttttttattatttaattaataacaataaatgttctttttgagttttgttttc 

TTCAAAAATAGTTCGGCTGTTTCTAGACTTTCCTTTTTT 

>G1181 Amino Acid Sequence (domain in AA coordinates: 24-114) 

mnsppvdamitgesssqrsiptpfltktfnlvedssiddviswnedgssfivwnpto 
dllpkhfkhnnfssfvrqlntygfkkv^ 

twaps seqrnqtmws psnsgednnnnqvms s s ps s wychqtkttgngglis velleene 
kijrsqniqlnreltqmksicdniyslmsnyvgsqptdrsyspggsssqpmefiipakrfse 
MEIEEEEEASPRLFGVPIGLKRTRSEGVQVKTTAVVGENSDEETPWLRHYNRTNQRVCN*
>G1228 (63..1139)

GCATTTATAATTACTCACTCATCTTCTTTTCATTACATTACATACCAAACAAGAGCTCTC 
AAATGGAAAGGTTTCAAGGACACATCAACCCCTGTTTCTTCGATCGAAAACCGGATGTGA 
GAAGCCTCGAGGTTCAAGGATTTGCAGAGGCTCAAAGCTTTGCTTTCAAAGAAAAAGAGG 
AAGAAAGCTTACAAGATACAGTTCCATTTCTACAGATGCTGCAAAGTGAAGACCCCTCAT 
CG IU'TITT TCAATCAAAGAGCCAAACTTTCTGACGCTACTGTCTCTTCAAACCCTCAAGG 
AGCCTTGGGAACTCGAAAGATATCTTTCACITGAGGATT 

AATCTGAGACCAACCGCTTCATGGAAGGAGCCAATCAAGCTGTGTCAAGCCAAGAAATTC

CCTTTAGCCAAGCAAACATGACACTCCCTTCTTCTACCTCAT 

CAAGACGAAAGCGCAAAATGAACCACTTGCTGCCTCAAGAA^ 

AGAGGAGGAAAACAAAACCAAGTAAAAACAATGAAGAGATTGAGAATCAAAGAATAAACC 
ACATTGCTGTTGAACGAAACAGAAGACGTCAAATGAACGAACATATCAACTCTCTCCGGG 
CCCTTCTCCCACCTTCCTAC^TC 

TAAACTACGTGAAGGTCCTCGAGCAAATCATACAATCTCTCGAATCGCAAAAGAGAACGC 

AACAACAAAGTAACAGTGAGGTAGTAGAAAACGCACTTAATCATCTCTCAGGCATTTCGT 

CGAACGACCTGTGGACAACTCTTGAAGATCAAACTTGTATCCCCAAAATCGAAGCTACA 

TGATACAAAACCATGTCAGCCTTAAAGTTCAATGTGAGAAGAAACAAGGAGAA 

AAGGAATCATATCACTTGAAAAGCTTAAACTCACTGTT^ 

CGTCTCATTCCTCTGTTTCTTATTCCTTCAACCTCAAGATGGAAGATGAGTGCGACTTAG 
AGTCAGCCGACGAGATTACGGCGGCTGTTCATCGGATTTTCGATATTCCGACAATTTGAT 
TAAACACATATAATTCCAAAAATATTAACAGCTGACAAAATGGTATCTTTGCGGCC 
>G1228 Amino Acid Sequence (domain in AA coordinates: 179-233) 
MERFQGHINPCFFDRKPDVRSLEVQGFAEAQSFAFKEKEEESLQDTVPFLQMLQSEDPSS
FFSIKEPNFLTLLSLQTLKEPWELERYLSLEDSQFHSPVQSETNRFMEGANQAVSSQEIP
FSQAI^TLPSSTSSPLSAHSRRKRKINHLLPQEMTREKRKRRKTKPSKNNEEI^ 
IAVERNRRRQMNEHINSLRALLPPSYIQRGDQASIVGGAINYVKVLEQIIQSLESQKRTQ
QQSNSEVVENALNHLSGISSNDLWTTLEDQTCIPKIEATVIQNHVSLKVQCEKKQGQLLK
G 1 1 SLEKLKLTVTiHLNT TTS SHS S VS YS FNLKMEDECDLES ADE ITAAVHRI FD I PT I * 
>G1277 (51..512)

ATTCTAAAGTCCTCCTCTCGGAAAGTAAGAGACTCAACTTCCGAGCCGCCATGGACGCCG 
GAGTAGCAGTAAAAGCTGACGTGGCAGTCAAAATGAAGAGAGAAAGACCATTCAAAGGGA 
TCAGAATGAGAAAATGGGGGAAATGGGTTGCGGAGATTCGAGAACCCAACAAGCGTTCAA 
GACTTTGGCTCGGCTCTTACTCTACTCCCGAAGCGGCGGCGCGTGCATACGACACGGCTG 
TCTTTTACCTCAGAGGACCAACTGCTACGCTCAACTTCCCGGAGCTTCTGCCGTGTACCT 
CCGCCGAGGATATGT(^GCGGC^^CGAT(^GGAAAAAGGCGACGGAGGTGGGAGCT(^AG 
TAGATGCGATAGGGGCGACGGTGGTGCAGAACAACAAACGCCGCCGCGTTTTTAGTCAAA 
AGCGTGACTTTGGCGGCGGGTTATTAGAGCTTGTTGACTTGAACAAGTTACCTGACCCGG 
AAAATCTCGATGATGATTTGGTGGGAAAATAGACTGAAAAATAATAATAAAATATCTTAC 
AATGGTGGCTGTAGCTATCGTACGCGGAATGCTTGGGCTTGTGTTATATGACTACGTGGT 
TACGGAAAGATTCCTCTGTTTCGTCATTGTATTAAAATTTAATCCCACAAGTCAAACATA 
CTGTACATTATTCTTAATTTAGTATTTTCTTATTAATATCTATCATTTGTTTGGTGAACA 
CCAGAATATTAGACTATTAATGTAACGAGTTTTTAATATTTCGATCATAATAACACCAAG 
CTAGTTAAAGGTTAATATCTTGTTACGAAGTCTTGAGTAAGTTCAATTGTCATATATATG 
TAACGGAAGAGGTTCGTTCGGGTCCCAAGTGAAGTGGATCAAAGGTGACTTCAC^TAAAA 
AATAAAAAAAA 

>G1277 Amino Acid Sequence (domain in AA coordinates: 18-85) 
MDAGVAVKADVAVKMKIU3RPFKGIRMRKWGK^ 

DTAVTYLRGPTATLNFPELLPCTSAEDMSAATIRKKATEVGAQVDAIGATW 

FSQKRDFGGGIiIiELVDLNKLPDPENLDDDLVGK* 

>G1309 (53..859)

CGTCGACCTCTTAATTAAGACGACTTGAGAGAGAAAGAAAGATACGTGGAAGATGACCAA 
ATCTGGAGAGAGACCAAAACAGAGACAGAGGAAAGGGTTATGGTCACCTGAAGAAGACCA 
GAAGCTCAAGAGTTTC^TCCTCTCTCGTGGCC^^ 

AGCTGGATTGCAAAGGAATGGGAAAAGCTGCAGATTAAGGTGGATTAATTACCTAAGACC 
AGGACTAAAGAGGGGGTCGTTTAGTGAAGAAGAAGAAGAGACCATCTTGACTTTACATTC 
TTCCTTGGGTAACAAGTGGTCTCGGATTGCAAAATATTTACCGGGAAGAACAGACAACGA 
GATTAAGAACTATTGGCATTCCTATCTGAAGAAGAGATGGCTCAAATCTCAACCACAACT
CAAAAGCCAAATATCAGACCTCACAGAATCTCCTTCTTCACTACTTTCTTGCGGGAAAAG 
AAATCTGGAAACCGAAACCCTAGATCACGTGATCTCCT^CCAGAAATTTTCAGAGAATCC 
AACTTCATCACCATCCAAAGAAAGCAACAACAACATGATCATGAA.CAACAGTAATAACTT 
GCCTAAACTGTTCTTCTCTGAGTGGATCAGTTCTTCAAATCCACACATCGATTACTCCTC 
TGCTTTTACAGATTCGAAGCAGATTAA 

GATGATGATCAATAACAACAACTACTCTTCACTTGAGGATGTCATGCTCCGTACAGATTT 
TTTGCAGCCTGATCATGAATATGCAAATTATTATTCTTC 

TGACCAAAATTATGTCTAAGAAGAGTGAATATGATCGTAAGAGGAACATAAGCTAGTTAC 
TTGTGTTACAGC 

>G1309 Amino Acid Sequence (domain in AA coordinates: 9-114) 

MTKSGERPKQRQRKGLWSPEEDQKLKSFILSRGHAOTTTV^ 

LRPGLKRGSFSEEEEETILTLHSSLGNKWSRIAKYIiPGRTD^ 

PQLKSQISDLTESPSSLLSCGKRlTLETETLDH\rESFQKFSENPTSSPSKESNNl^IIVINNS 
NNLPKLFFSEWISSSNPHIDYSSAFTDSKHINETQDQINEEEVMMIHNl^ 
TDFLQPDHEYAMYYS SGDFF INSDQNYV* 
>G1314 (1..990)

ATGGGAAGAGCTCCGTGTTGCGACAAGACAAAAGTGAAGCGAGGGCCTTGGTCGCCTGAA 
GAAGACTCTAAACTTAGAGATTACATTGAAAAGTATGGTAATGGTGGAAATTGGATCTCT 
TTCCCCCTCAAAGCCGGTTTGAGGAGATGTGGGAAGAGTTGTAGACTGAGGTGGCTAAAC 
TATTTGAGACCAAACATAAAGCATGGTGACTTCTCTGAGGAAGAAGACAGGATCATTTTT 
AGTCTCTTCGCTGCCATAGGAAGCAGGTGGTCT^ATAATAGCAGCTCATCTACCGGGACGA 
ACAGACAACGACATAAAAAACTATTGGAACACAAAGCTAAGGAAGAAACTCTTGTCTTCT 
TCCTCTGATTCATCATCATCAGCCATGGCTTCTCCTTATCTAAACCCTATTTCTCAGGAT 
GTGAAAAGACCAACCTCACCAACAACAATCCCZATCTTCTTCTTACAATCCGTATGCTGAA 
AACCCTAATCAATACCCAACAAAATCCCTCATCTCCAGCATCAATGGCTTCGAAGCTGGT 
GACAAACAGATAATTTCCTATATTAACCCTAATTATCCTCAAGATCTCTATCTCTCGGAC 
AGCAACAAG^CACCTCGAACGCAAATGGTTTCTTGCT^ 

TACAAGAACCACACCAGTTTTTCTTCAGACGTCAATGGGATAAGATCAGAGATTATGATG 
AAGCAAGAAGAGATAATGATGATGATGATGATAGACCACCACATTGACCAGAGGACAAAA 
GGGTACAATGGGGAATTCACACAAGGGTATTATAATTACTACAATGGGCATGGGGATTTG 
AAGCAAATGATTAGTGGAACAGGCACTAATTCTAACATAAACATGGGTGGTTCAGGTTCA 
TCTTCTAGTTCGATAAGCAACCTAGCTGAGAACAAAAGCAGTGGTAGCCTCCTACTAGAA 
TACAAATGCTTGCCCTATTTCTACTCCTAG 

>G1314 Amino Acid Sequence (domain in AA coordinates: 14-116) 

mgrapccdktkvkrgpws peeds klrdyi ekygnggnw i s fplkaglrrcgks crlrwiin 

ylrpnikhgdfseeedriifslfaaigsrwsiiaahlpgrtdi^ik^^ 

ssdssssai^pylnpisqdvkrptspttipsssynpyaenpnqyptksiilssingfeag 

DKQIISYINPNYPQDLYLSDSNT^TSNANGFIjIiNHNMCDQYKNHTSFSS^ 
KQEEIMMMMMIDHHIDQRTKGYNGEFTQGYYNYYNGHGDLKQMISGTGTNSNINMGGSGS 
SSSSI SNLAENKS SGSLL.LE YKCLPYFYS * 
>G1317 (1..849) 

ATGGGAAGATCACCTTGTTGTGATAAAAATGGAGTGAAGAAGGGACCATGGACTGCTGAG 
GAGGATCAGAAACTCATCGATTATATTCGATTTCATGGTCCTGGCAATTGGCGTACGCTC 
CCCAAAAATGCTGGACTCCATAGATGTGGAAAAAGCTGCCGTCTTCGATGGACGAATTAT 
CTAAGACCGGACATCAAGAGAGGAAGATTCTCGTTCGAGGAAGAAGAAACTATCATTCAG 
CTACACAGTGTTATGGGAAACAAGTGGTCAGCAATAGCCGCTCGTCTACCAGGGAGGACC 
GATAACGAAATAAAAAACC^TTGGAACACTCACATCCGCAAGAGACTTGTAAGGAGTGGT 
ATCGACCCTGTTACTCATTCTCC^CGCCTTGATCTTCTTGATTTGTCCTCACTTTTGAGT 
GCACTTTTCAACCAGeCAAACTTTTCAGCAGTTGCAAC^CATGCGT 

CCTGATGTATTGAGGTTGGCCTCTCTACTACTGCCACTTCAAAACCCTAATCCAGTTTAC 
CCATCGAACCTCGACCAAAATCTTCAAACTCCAAATACATCATC^GAATCGTCTCAACCA 
CAAGCTGAGACTAGTAC^GTCCCAACAAACTATGAAACTTCATCATTGGAGCCTATGAAC 
GCAAGACTCGACGACX3TTGGTCTTGCAGATGTATTACCACCTTTGTCAGAGAGTTTTGAC 
TTAGACTCGCTCATGTCAACGCCAATGTCTTCTCCACGAC^^AATAGCATTGAAGCAGAA 
ACCAACTCCAGCACTTTCTTCGACTTTGGAATTCCGGAAGATTTCATCTTAGATGACTTT 

ATGTTTTAA 
>G1317 Amino Acid Sequence (conserved domain in AA coordinates: 13-118)
MGRSPCCDKNGVKKGPWTAEEDQKLIDYIRF^ 

LRPD I KRGRFS FEEEET I IQLHS VMGNKWS AI AARLPGRTDNE I KNHV7NTHIRKRLVRSG 

IDPVTTHSPRI^LLDLSSLLSALFNQPNFSAVATHASSLLNPDv^ 

PSNLDQNLQTPOTSSESSQPQAETSTVT>TNYETSSL^ 

LDSLMSTPMSSPRQNSIEAETNSSTFFDFGIPEDFILDDFMF* 

>G1323 (49..870)

AAGAGGGAATCTCAAAAGTGTGTGTCTGTGAGAGAGGAGAGAGAGAATATGGGCAAAGGA 
AGAGCACCATGTTGTGACAAAACCAAAGTGAAGAGAGGACCATGGAGCCATGATGAAGAC 
TTGAAACTCATCTCTTTCATTCACAAGAATGGTCATGAGAATTGGAGATCTCTCCCAAAG 
CAAGCTGGATTGTTGAGGTGTGGCAAGAGTTGTCGTCTGCGATGGATTAATTACC^CZAGA 
CCTGATGTGAAACGTGGCAATTTCAGTGCAGAGGAAGAAGACACCATCATCAAACTTCAC
CAGAGCTTTGGTAACAAGTGGTCGAAGATTGCTTCTAAGCTGCCTGGAAGAACAGACAAT
GAGATCAAGAATGTGTGGCATACACATCTCAAGAAAAGATTGAGCTCGGAAACTAACCTT
AATGCCGATGAAGCGGGTTCAAAAGGTTCTTTGAATGAAGAAGAGAACTCTCAAGAGTCA 
TCTCCAAATGCTTCAATGTCTTTTGCTGGTTCCAACATTTC7UVGCAAAGACGATGATGCA 
CAGATAAGTCAAATGTTTGAGCACATTCTAACTTATAGCGAGTTTACGGGGATGTTACAA 
GAGGTAGACAAACCAGAGCTGCTGGAGATGCCTTTTGATTTAGATCCTGACATTTGGAGT 
TTC^TAGATGGTT<^GACT(^TTCCAACAACCAGAGAACAGAGCTCTTCAAGAGTCTGAA 
GAAGATGAAGTTGATAAATGGTTTAAGCACCTGGAAAGCGAACTCGGGTTAGAAGAAAAC 
GATAACCAACAACAACAACAGCATAAACAGGGAACAGAAGATGAACATTCATCATCACTC 
TTGGAGAGTTACGAGCTCCTCATACATTAATGAAGC 

GAAAATGGAATTATTAGCTAACTTATTGGCATTATTAGTATATAAGCAAGATCAGATAGG 
CGCATGTAGTAGCAACAACGAAGAAACGTCGAATTGTAGACAAAATGTAGATATTACAGA 
GTTGAAAGATTGTATTTTGCAAATGATTGCTTTGTAGTGAAATCAAGTTATCACAAAAAA 
AAAAAAAA 

>G1323 Amino Acid Sequence (domain in AA coordinates: 15-116) 

MGKGRAPCCDKTKVKRGPWSHDEDLKLISFIHKNGHENWRSLPKQAGLLRCGKSCRLRWI

NYLRPDVKRGNFSAEEEDTIIKLHQSFGNKWSKIASKLPGRTDNEIKNVWHTHLKKRLSS

ETNLNADEAGSKGSLNEEENSQESSPNASMSFAGSNISSKDDDAQISQMFEHILTYSEFT

GMLQEVDKPELLEMPFDLDPDIWSFIDGSDSFQQPENRALQESEEDEVDKWFKHLES

LEENDNQQQQQHKQGTEDEHSSSLLESYELLIH* 

>G1332 (1..606) 

ATGGAATGCAAAAGAGAAGAAGGGAAGTCTTACGTGAAGAGAGGGTTGTGGAAACCAGAA 
GAAGATATGATATTAAAAAGCTATGTTGAGACTCATGGTGAAGGAAACTGGGCAGACATT 
TCTCGTAGATCCGGGTTGAAGAGAGGAGGAAAAAGCTGTAGGCTGAGATGGAAGAACTAT 
CTAAGACGAAATATCAAAAGAGGAAGCATGTCA^ 

ATGCATAAGCTTCTTGGAAAC^GATGGTCGTTGATCGCTGGTCGCCTTCCAGGTCGTACT 
GACAATGAAGTGAAGAACTACTGGAATACTCATTTGAACAAGAAACCTAATTCCCGAAAA
(^GAATGC^CCTGAATCAATCGTCGGCGCCACTCCTTTCACGGATAAGCCAGTTATGTCT 
ACAGAACTGAGAAGAAGCCATGGAGAAGGAGGAGAAGAGGAGAGCAATACCTGGATGGAG 
GAGACCAACCACTTTGGCTATGACGTCCACGTAGGATCTCCCTTGCCACTTATTTCCCAC 
TACCCAGACAACACTCTCGTGTTTGACCCATGTTTTTCCTTTACCGATTTCTTTCCTCTG 
CTTTAG 

>G1332 Amino Acid Sequence (conserved domain in AA coordinates: 13-116)
meckreegksyvkrglwkpeedmilksyvethgeg;^^ 

LRPNIKRGSMSPQEQDLIIRMHKLLGNRWSLIAGRIiPGRTDNEVKNYWNT^ 
QNAPESIVGATPFTBKPVMSTEIjRRSHGEGGEEESNTWMEETNHFG 
YPDNTLVFDPCFSFTDFFPLL* 
>G1334 (76..885)

ATAGCTCCCAACTAATAGGAATCTC^GCTTCTCACTCTCTCTTGTTTTTCCATTGGACT 
TTTGGAACATAAGCTATGC^AACTGAGGAGCTTI^GTCGCCACCACAGACTCCTTGGTGG 
AATGCTTTTGGATCTCAGCCGTTGACTACAGAGAGCCTTTCCGGCGAAGCTTCTGATTCA 
TTCACCGGAGTTAAGGGAGTTACTACGGAGGGAGAACAAGGTGTGGTGGATAAACAAACT 
TCTACAACTCTCTTCACTTTCTCACCTGGTGGTGAAAAGAGTTCAAGAGATGTGCCAAAG 
CCTCATGTTGCTTTCGCGATGCAATCAGCTTGCTTCGAG^ 

ATGTACACAAAGCATCCTCATGTTGAACAATACTATGGAGTTGTTTCAGCATACGGATCT 
CAGAGGTCTTCGGGCCGAGTAATGATTCCACTGAAGATGGAGACAGAAGAAGATGGTACC

ATCTATGTGAACTCAAAGCAGTACCATGGAATTATCAGGCGACGCCAGTCCCGAGCAAAG 

GCTGAAAAACTGAGTAGATGCCGTAAGCCATATATGCATCACTCACGCCATCTCCATGCT 

ATGCGCCGTCCTAGAGGATCTGGCGGGCGTTTCTTGAACACCAAGACAGCTGATGCGGCT 

AAGCAGTCTAAGCCGAGTAATTCTCAGAGTTCTGAAGTCTTTCATCCGGAAAATGAGACC 

ATAAACTCATCGAGGGAAGC^AATGAGTC^TVATCTCTCGGATTCTGCAGTTACAAGTATO 

GATTACTTTCTAAGTTCGTCGGCTTATTCTCCTGGTGGCATGGTCATGCCTATCAAGTGG 

AATGCAGCAGCAATGGATATTGGCTGCTGCAAACTTAATATATGATCAGCA 

CAAGAC^TGATTGGTCACCAGTCCTTTTGTCTTGTCCCTTATCTTTCAG 

GAGAACTTGTGTCTTGGAAAAAAGACATTGAGTTTCCTTGGTTTATAAGATTGGTCCTTT 

TACCATCCGTTTGGCTGTAAACAGGGAAATCATC^ 

ATCTTCGTCTGTTTTCTTCTACGCATCTTCATAAGATCTCTGAACTAGTGAATAACATTT 

CCTAGCATCATGTTTCAACTAGTGTGTGTTGTAAGAAAC^CTGCCTTATTTCCAGATGAT 

GTATTGTGTGTAACGTGTTTATGAAAGAAACGTAAGACTT 

AAAAAAAAAAAAAA 

>G1334 Amino Acid Sequence (domain in AA coordinates: 18-190) 
MQTEELLSPPQTPWWNAPGSQPLTTESLSGEASDSFTGVKAVTTEAEQGVVDKQTSTTIiF 
TFS PGGEKS SRDVPKPHVAFAMQS ACFEFGFAQPMMYTKHPHVEQYYGVVS AYGSQRS SG 
RVMIPLKMETEEDGTIYVNSKQYHGIIRRRQSRAKAEK^^ 

GSGGRFLOTKTADAAKQSKPSNSQSSEVFHPENETINSSREAIJESNLSDSAW 
SSAYSPGGMVMPIKWNAAAMDIGCCKLNI * 
>G1381 (32..802)

CAGCTTTAACACTACTCTCTCTCTCTCTCAAATGGGAAAACAAATCAACATAGAGAGTAG 
TGCTACTCATCATCAAGACAATATTGTTTCCGTTATAACAGCCACGATATCCTCCTCCTC 
CGTCGTAACGTCTTCGTCAGACTCTTGGTCTACCTCCAAAAGATCGTTAGTGCAAGACAA 
TGACTCCGGAGGGAAACGGCGGAAGAGCAACGTTAGTGATGATAACAAGAATCCGACGTC 
GTATAGAGGAGTGAGGATGAGGAGTTGGGGAAAATGGGTGTCGGAGATTAGAGAGCCGAG 
GAAGAAATCAAGAATATGGCTTGGCACTTATCCAACGGCAGAGATGGCAGCTCGTGCTCA 
TGATGTGGCGGCTTTAGCTATTAAAGGCAACTCCGGTTTTCTTAATTTCCCTGAATTATC 
CGGTTTGCTTCCTCGTCCGGTTAGCTGCTCTCCTAAGGATATACAAGCTGCAGCTACCAA 
AGCCGCCGAAGCAACCACGTGGCACAAACCGGTTATCGATAAGAAATTAGCTGATGAGCT 
AAGCCACTCTGAGTTGTTGTCTACCGCTCAGTCTTCGACTTCTAGTAGTTTCGTGTTTTC 
TTCGGACACGTCGGAGACTTCTAGTACGGACAAGGAAAGCAACGAAGAGACGGTGTTTGA 
TTTGCCGGACCTTTTCACGGACGGGCTTATGAACCCAA 

CGGCACCTTTACGTGGCAGCTTTACGGAGAGGAGGATGTAGGGTTCAGGTTTGAAGAGCC 
GTTTAATTGGCAAAATGACTAAACCGCCCTCCACTTGCTTACTGTAATTACT'AACATATA 
ATTTTCTTGATAAAGAACATATATTTC 

TCTCTTTTCTTGTTTCTACATCTGAGTATATTGTCACTATGTGAAAAAATTGATCTCGTT 

TTGAATATTTACTTTTCAAAATTGAAGTAACGCAAGTGATTGATAAAAAAAAAAAAA 

>G1381 Amino Acid Sequence (domain in AA coordinates: TBD)

MGKQINIESSATHHQDNIVSVITATISSSSWTSSSDSWSTSKRSLVQDNDSGGKRRKSN 

VSDDNKNPTSYRGVRMRSWGKWSEIREPRKKSRIWLGTYPTAEMAARAm 

SGFLNFPELSGLLPRPVSCSPKDIQAAATKAAEATTWHKPVIDKKLADELSHSE 

SSTSSSFVFSSDTSETSSTDKES^ETVFDLPDLFTDGLMNPNDAFCLCNGTFTWQLYGE 

EDVGFRFEEPFNWQND* 

>G1382 (90..1763)

CTCTCATTTCGCCATAGCTGAGAGCTTCTTCTACTTTCCCTTAGCTTCTTTTTTCCTTCA 
TTTTTGTTCTACCGCTGCGAATCTCTGAAATGAACCCTCAAGCTAATGACCGGAAGGAGT 
TTCAGGGAGATTGTTCGGCGACGGGAGATCTCACGGCAAAGCACGATTCAGCTGGAGGAA 
ACGGAGGTGGAGGTGeTAGGTATAAGCTGATGTCACCGGCCAAGCTTCCGATCTCGAGGT 
CGACTGATATCACGATTCCTCCTGGGTTGAGTCCGACTTCGTTTTTGGAATCTCCTGTTT 

CAGTGCACATTTCTGCTAGCTCAAGTTCTT^ 

TTACTGAGCAGAAGTCCAGTGAATTTGAGTTCAGACCTCCTGCATCAAATATGGTATATG 
CAGAGCTTGGCAAGATTAGAAGTGAGCCACC^GTACATTTTCAAGGCCAGGGCCATGGAT 
CCTCACACTCACCTTCTTCGATCAGTGATGCTGCAGGTTCCTCAAGTGAGCTAAGCCGGC 
CAACTCCTCCTTGTCAGATGACACCAACGAGCTCAGATATTCCGGCTGGATCTGATCAAG 
AGGAATCAATCCAGACTTCCCAAAATGACTCCAGAGGAAGCACTCCATCCATCTTGGCTG

ATGATGGTTATAACTGGAGAAAATATGGTCAAAAGCATGTCAAAGGGAGTGAATTTCCCC 

GGAGCTATTATATU^TGTACACATCCTAATTGTGAAGTGAAAAAGTTATTTGAAAGATCTC 

ATGATGGGCAGATCACCGATATTATATACAAGGGTACACATGACCATCCTAAACCTCAAC 

CTGGTCGCCGAAACTCTGGTGGTATGGCTGCACAAGAAGAAAGGCTAGACAAGTATCCTT 

CTTCAACTGGCCGAGATGAGAAGGGATCTGGCGTCTACAACTTGTCTAACCCCAATGAAC 

AAACTGGTAACCCTGAAGTACCTCCTATCTCAGCATCTGACGATGGTGGAGAAGCGGCAG 

CGTCAAATAGGAATAAAGATGAGCCGGACGATGATGATCCATTCTCAAAACGGAGGAGGA 

TGGAGGGTGCGATGGAAATAACTCCACTAGTGAAACCCATCCGGGAGCCTCGGGTTGTTG 

TTCAAACTCTGAGTGAGGTTGACATTCTGGATGATGGTTATAGATGGCGCAAATATGGGC 

AGAAAGTCGTAAGGGGGAACCCAAATCCCAGGAGCTACTACAAATGCACAGCTCATGGAT 

GCCCAGTGAGAAAACACGTGGAGAGAGCATCACATGATCCAAAAGCTGTAATAACAACAT 

ACGAAGGCAAACACGATCATGATGTTCCCACTTCA7UVGTCTAGCAGCAATCACGAAATCC 

AGCCTCGGTTCAGACCAGATGAAACAGACACCATCAGCCTCAATCTTGGTGTTGGAATCT 

CATCTGATGGACCTAACCACGCTTCCAACGAAGATCAGCACCAGAATCAACAACl^ 

ACCAAACTGA£CGAAATGGAGTCAATT 

ACTATGCTAGCTTAAATAGCGGTATGAATCAGTACGGCCAGAGAGAAACAAAGAACGAGA 
CTCAAAATGGTGACATCTCGTCCTTGAACAATTC^TCTTACCCATATCCGCCCAACATGG 
GGAGAGTAGAATCGGGTCCGTAAAACAAAAAGTAAGCAACATTATGTACGGGATCTTCTT 
AGGTTAGGAATGGGACGAGGCCTTGTTCTATATAATTCCTATTTCTTCACAGAGAGCTGA 
TCTTGATTCAAACTATCTCCACCATATATATTTGTTTGTGTCACCTGTATTGAGTTCCAA 
AAATGTTATGTAAAAATACACAACAA.GATGTTAATGCTTTTATTTAAACAAGAAACAGCA 
ATATTACTACAAAAAAAAAAAAAAAAAA 

>G1382 Amino Acid Sequence (domain in AA coordinates: 210-266, 385-437) 

MNPQANDRKEFQGDCSATGDIjTAKHDS AGGNGGGGARYKLMSPAKLPI srstditi ppgl 

SPTSFLESPVFISNIKPEPSPTTGSIiFKPRPVHISASSSSYTGRGFHQNTFTEQKSSEFE 

FRPPASNMVYAELGKIRSEPPVHFQGQGHGSSHSPS S ISDAAGSSSELSRPTPPCQMTPT 

SSDIPAGSDQEESIQTSQNDSRGSTPSIIjADDGYITWRKYGQKHVKGSEFPRSYYKCTHPN 

CEVKKLFERSHDGQITDIIYKGTHDHPKPQPGRRNSGGMAAQEERIiDKYPSSTGRDEKGS 

GVYNLSNPNEQTGITOEVPPISASDDGGRAAASNR]^ 

VICPIREPRVWQTLSEVDILDDGYRWRKYGQKVVRGNPNPRSYYKCTAHGCPVRKB^TERA 
SHDPKAVITTYEGKHDHDVPTSKSSSNHEIQPRFRP 

EHQHQNQQLWQTHPNGVNFRF VHAS PMS S YYASL»NS GMNQYGQRETKNETQNGD I S SIjN 
NS S YPYPPNMGRVQSGP * 
>G1435 (8..904)

GTGAAACATGGGGAAGGAAGTTATGGTGAGCGATTACGGTGACGACGACGGAGAAGACGC 
CGGCGGCGGCGATGAATATAGGATTCCGGAATGGGAAATTGGTTTACCCAACGGAGATGA 
TTTGACTCCGTTATCTCAATATCTAGTCCCGTCGATTCTCGCGTTAGCTTTCAGCATGAT 
CCCAGAACGAAGCCGTACAATTCACGACGTCAATCGCGCGTCGCAAATCACGCTCTCTTC 
GTTGAGAAGCAGTACCAATGCTTCGTCTGTGATGGAGGAGGTCGTGGATCGAGTTGAATC 
GAGTGTTCCAGGATCAGATCCGAAGAAACAGAAGAAATCGGATGGTGGTGAAGCAGCGGC 
GGTGGAGGATTCCACGGCGGAGGAAGGAGACTCCGGGCCTGAAGACGCGTCTGGGAAGAC 
ATCGAAACGACCGCGTTTAGTGTGGACACCGCAGCTACACAAGAGATTTGTGGACGTTGT 
GGCTCATCTAGGGATTAAAAACGCAGTGCCGAAGACGATTATGCAGCTGATGAACGTGGA 
AGGACTTACTCGTGAGAACGTTGCGTCTCATTTGGAGAAATATAGGCTTTACCTTAAACG 
GATTCAAGGATTGACGACGGAAGAAGATCCTTATTCGTCGTCGGATGAGCTCTTCTCTTC 
AACGCCGGTTCCTCCACAGAGCTTTCAAGACGGCGGAGGAAGTAACGGAAAGTTGGGGGT 
TCCGGTTCCGGTTGeGTCGATGGTGCCTATTCCAGGCTATGGGAATCAAATGGGTATGCA 
AGGATATTATCAACAGTATAGTAACCATGGCAATGAATCAAACCAATATATGATGCAGCA 
GAATAAGTTTGGAACAATGGTGACATATCCTTCTGTTGGTGGTGGTGACGTGAATGACAA 
GTAAATGGATCTTAAAGGTCTATAATTTGCTCTACAGAGAGATACTGGTTCTTGGCTTAT 
GGTTTATTTTCCCACTTCATGAGGTTGTTGTGACTTTT^ 

AGTCTTTATTGCCTTTGTATAGAAAATGATTTCGAGAAAATCACTGGGAAGCTTGGTATT 
GTTGGAGGATGAAGCCTTCTATGAATGATTTAGTTTCCTACTGTCTCCATTCTTTATGAG 
GTAATAAAGCCTTCTTTTGCTCATCGCTTGTAGTCTTCTTAAATTCAAGACZAGCGTCACA 
TGTTTGTTCGGTTATGTTAATTGTTTCTTTCTTTGGATAATGAAGATAGCATCAGGTCTC 
ATGTCTCCTCACTTTGATAAA 
>G1435 Amino Acid Sequence (domain in AA coordinates: 146-194)
MGKEVMVSDYGDDDGEDAGGGDEYRIPEWEIGLPNGDDLTPLSQYLVPSILALAFSMIPE
RSRTIHDVNRASQITLSSLRSSTNASSVMEEVVDRVESSVPGSDPKKQKKSDGGEAAAVE
DSTAEEGDSGPEDASGKTSKRPRLVWTPQLHKRFVDVVAHLGIKNAVPKTIMQLM^^
TRENVASHLQKYRLYLKRIQGLTTEEDPYSSSDQLFSSTPVPPQSPQDGGGSNGKLGVPV
PVPSMVPIPGYGNQMGMQGYYQQYSNHGNESNQYMMQQNKFGTMVTYPSVGGGDVNDK*
>G1537 (1..783) 

ATGGAAAACGAAGTAAACGCAGGAACAGCAAGCAGTTCAAGATGGAACCCAACGAAAGAT 
CAGATCACGCTACTGGAAAATCnTTACAAGGAAGGAATACGAACTCCGAGCGCCGATCAG 
ATTCAGCAGATGACCGGTAGGCTTCGTGCGTACGGCCATATCGAAGGTAAAAACGTCTTT 
TACTGGTTCCAGAACCATAAGGCTAGGCAACGCCAAAAGCAGAAACAGGAGCGCATGGCT 
TACTTCAATCGCCTCCTCCACAAAACCTCCCGTTTCTTCTACCCCCCTCCTTGCTCAAAC 
GTGGGTTGTGTGAGTCCGTACTATTTAGAGCAAGC^ 

GGAAGTGTATACACAAACGATCTTCTTCACAGAAACAATGTGATGATTCCAAGTGGTGGC 
TACGAGAAACGGACAGTCACACAACAT(^GAAACAACTTTCAGAC^ 

GCCACAAGAATGCCAATTTCTCCGAGTTCACTCAGATTTGACAGATTTGCCCTCCGTGAT 
AACTGTTATGCCGGTGAGGACATTAACGTCAATTCGAGTGGACGGAAAAC^CTCCCTCTT 
TTTCCTCTTCAGCCTTTGAATGCAAGTAATGCTGATGGTATGGGAAGTTCCAGTTTTGCC 
CTTGGTAGTGATTCTCCGGTGGATTGTTCTAGCGATGGAGCCGGCCGAGAGCAGCCGTTT 
ATTGATTTCTTTTCTGGTGGTTCTACTTCTACTCGTTTCGATAGTAATGGTAATGGGTTG 
TAA 

>G1537 Amino Acid Sequence (domain in AA coordinates: 14-74) 
MENEVNAGTASSSRWNPTKDQITIiLENIjYKEGIRTPSADQIQQITGRLRAYGHIEGKIWF 
YWFQISTHKARQRQKQKQERMAYFNRLIjHKTSRFFYPPPCSNVGCVSPYYLQQA^ 
GS\TYTNDIjIjHRNNVMIPSGGYEKRTVTQHQKQL^ 

NCYAGEDINVNSSGRKTLPLFPLQPLNASNADGMGSSSFAIjGSDSPVDCSSDGAGREQPF 

idffsggststrfdsngngl* 

>G1545 (67..729)

C^TCACC^UVTCTTTTGAATCTAAGAGAGAGAAGAAGAAGAAGGTCTAGAGAACGAAAAGA 

AGAAACATGAATAACCAGAATGTAGATGATCATAATCTTCTACTCATTTCTCAATTGTAC 

CCTAATGTCTATACTCCATTAGTACCACAACAAGGAGGAGAAGCAAAACCAACACGGCGG 

AGGAAAAGGAAGAGCAAGAGTGTTGTGGTGGCAGAGGAGGGTGAAAACGAAGGCAATGGG 

TGGTTTAGAAAGAGAAAATTGAGTGATGAGCAAGTAAGAATGTTGGAGATTAGCTTTGAA 

GACGATCATAAGCTTGAATCCGAGAGGAAAGATCGGCTTGCTTCTGAGTTAGGGCTTGAT 

CCTCGTCAAGTCGCCGTCTGGTTCCAAAACCGCCGTGCACGGTGGAAGAACAAACGAGTC

GAGGATGAATACACTAAACTCAAGAATGCATACGAAACCACCGTCGTTGAGAAATGTCGT 

CTTGATTCTGAGGTTATTCACCTAAAGGAACAACTTTACGAGGCTGAAAGAGAGATCCAA 

CGGCTTGCAAAAAGAGTTGAAGGAACTTTAAGTAACAGTCCTATCTCATCCTCTGTGACC 

ATTGAAGCCAATCATACGACACCGTTTTTTGGAGATTACGACATCGGATTTGACGGTGAG 

GCTGACGAGAACTTGCTCTACTCGCCAGATTACATTGATGGATTAGACTGGATGAGCCAA 

TTTATGTAAAAAACTATAAGCTAATCTATTTTCAGTCGTAGTATAG 

>G1545 Amino Acid Sequence (domain in AA coordinates: 54-117) 

MNNQNVDDHNLLLISQLYPNVYTPLVPQQGGEAKPTRRRKRKSKSVVVAEEGENEGNGWF

RKRKLSDEQVRMLEISFEDDHKLESERKDRLASELGLDPRQVAVWFQNRRARWKNKRVED

EYTKLKNAYETTVVEKCRLDSEVIHLKEQLYEAEREIQRLAKRVEGTLSNSPISSSVTIE

ANHTTPFFGDYDIGFDGEADENLLYSPDYIDGLDWMSQFM*

>G1641 (1..867) 

ATGGAGGTTATGAGACCGTCGACGTCACACGTGTCAGGTGGGAACTGGCTCATGGAGGAA 
ACTAAGAGCGGCGTCGCAGCTTCTGGTGAAGGTGCCACGTGGACGGCGGCAGAGAACAAG 
GCATTCGAGAATGCTTTGGCGGTTTACGACGACAACACTCCTGATCGGTGGCAGAAGGTG 
GCTGCGGTGATTCCGGGGAAGACAGTGAGTGACGTAATTAGACAGTATAACGATTTGGAA 
GCTGATGTCAGCAGCATCGAGGCCGGTTTAATCCCGGTCCCCGGTTACATCACCTCGCCG 
CCTTTCACTCTAGATTGGGCCGGCGGCGGTGGCGGATGTAACGGGTTTAAACCGGGTCAT 
CAGGTTTGTAATAAACGGTCGCAGGCCGGTAGATCGCCGGAGCTGGAGCGGAAGAAAGGC 
GTTCCTTGGACGGAGGAAGAACACAAGCTATTTCTAATGGGTTTGAAGAAATATGGGAAA 
GGAGATTGGAGT^AACATATCTCGGAACTTTGTGATAACGCGAACGCCAACACAAGTAGCT 
AGCC^CXSCCCAAAAGTACTTCATCCGGCAACT^ 
AGCATTCACGACATAACCACCGTAAATCTCGAAGAGGAGGCTTCTTTGGAGACCAATAAG
AGCTCCATTGTTGTTGGAGATCAGCGTTCAAGGCTAACCGCGTTTCCTTGGAACCAAACG 
GACAACAATGGAACACAGGCAGACGCTTTCAATATAACGATTGGAAACGCTATTAGTGGC
GTTCATTC^TACGGCCAGGTTATGATTGGAGGGTATAACAATGCAGATTCTTGCTATGAC 

GCCCAAAACACAATGTTTCAACTATAG 

>G1641 Amino Acid Sequence (domain in AA coordinates: 139-200) 
MEVMRPSTSHVSGGNWLMEETKSGVAASGEGATWTAAENKAFENALAVYDDNTPDRWQKV
AAVIPGKTVSDVIRQYNDLEADVSSIEAGLIPVPGYITSPPFTLDWAGGGGGCNGFKPGH
QVCNKRSQAGRSPELERKKGVPWTEEEHKLFLMGLKKYGKGDWRNISRNFVITRTPTQVA
SHAQKYFIRQLSGGKDKRRASIHDITTVNLEEEASLETNKSSIVVGDQRSRLTAFPWNQT
DNNGTQADAFNITIGNAISGVHSYGQVMIGGYNNADSCYDAQNTMFQL*
>G165 (19..699)

CTTCAAAACATCTAAAAAATGGTGAAAAAAACTCTTGGTCGTAGAAAGGTAGAGATAGTG 
AAAATGACTAAGGAATCAAACCTTCAAGTCACATTTTCG^ 

AAGAAGGCTAGTGAATTTTGCACATTATGTGATGCAAAAATTGCGATGATCGTGTTTTCA 
CCAGCTGGAAAAGTATTTTCTTTTGGTCATCCAAATGTTGAT 

CGAGGGTGTGTTGTAGGACACAACAACACAAACCTTGATGAAAGCTACACAAAGCTTCAT 
GTTCAAATGCTCAACAAATCCTACACTGAGGTGAAGGCGGAAGTAGAAAAAGAACAAAAG 
AATAAGCAGTCGCGGGCTCAAAATGAAAGAGAAAACGAAAACGCTGAGGAGTGGTGGAGT 
AAGTCTCCATTAGAACTGAACTTAAGTCAATCAACCTGTATGATACGTGTTCTTAAAGAT 
TTGAAGAAGATAGTTGATGAAAAAGCAATTCAATTAATCCATCAAACAAACCCAAACTTC 
TATGTTGGAAGTTCTAGCAATGCTGCTGCTCCAGCAACTGTTAGTGGTGGTAATATCTCC 
ACAAACCAGGGGTTCTTTGATCAAAACGGAATGACGACTAATCCTACTCAAACACTTCTG 

TTTGGATITGATACTATGAATCGC^^ 

GTCTTGGTACTATAAGTTCATCTCTCTCGTTGTTGACTTTTTAAGTCTCCAATAGTTTGT 

TGTG . n ^. 

>G165 Amino Acid Sequence (conserved domain in AA coordinates: 7-62)
MVKKTLGRRKVEIVKMTKESNLQVTFSKR
SFGHPNVDVLLDHFRGCVVGHNNTNLDESYTKLHVQMLNKSYTEVKAEVEKEQKNKQSRA
QNERENENAEEWWSKSPLELNLSQSTCMIRVLKDLKKIVDEKAIQLIHQTNPNFYVGSSS
NAAAPATVSGGNISTNQGFFDQNGMTTNPTQTLLFGFDTMNRTPGV*

>G1652 (77..1078)

AGCAAGTCCAAATCTCCCTCTCTCTCTCTCTATCTATCTCTCTATAGAAGATTTTTTAAC 

TAAGAAGCTAGCGATCATGGCCACAGCGATGAACGTTTTCTCTACCAAATGGTCCTCCGA 

ATTGGATATAGAAGAATATAGTATCATCCACCAATTCCACATGAACTCACTCGTCGGAGA 

TGTTCCACAGTCTCTCTCATCTCTTGATGATACCACCACTTGTTATAACCTTGATGCTTC 

TTGTAATAAAAGTTTGGTAGAAGAAAGACCTTCAAAGATCCTCAAGACCACTCACATATC 

ACCAAACTTACATCCTTTTTCTTCTTCTAATCCTCCTCCTCCAAAGCACCAGCCCTCTTC 

TAGGATTCTTTCTTTTGAAAAGACAGGTTTACATGTTATGAATCACAACTCTCCAAACTT 

AATATTTAGCCCCAAGGACGAAGAAATTGGATTACCAGAGCATAAGAAAGCCGAGCTGAT 

AATAAGAGGGACAAAGAGAGCTCAATCCTTGACTCGAAGCCAATCAAATGCTCAAGATCA 

CATACTGGCAGAGAGAAAACGGAGAGAGAAGCTTACTCAAAGATTTGTAGCTCTTTCCGC 

GCTAATTCCTGGCCTAAAGAAGATGGACAAGGCTTCTGTGTTGGGAGATGCAATAAAGCA 

TATAAAGTACCTCCAAGAGAGTGTGAAAGAGTATGAGGAACAAAAGAAGGAAAAGACAAT 

GGAATCAGTGGTTCTTGTAAAGAAGTCTAGTCTGGTTTTAGATGAAAATCATCAACCATC 

ATCATCATCTTCCTCAGATGGAAATCGCAATAGCTCGAGCTCAAATCTTCCAGAAATAGA 

AGTTAGGGTTTCAGGAAAAGATGTTCTTATTAAGATCCTATGCGAGAAGCAAAAGGGTAA 

TGTGATCAAGATTATGGGGGAGATTGAAAAGCTTGGTTTGTCTATCA^ 

CTTGCCCTTTGGACCCACTTTTGACATCTCTATTATCGCTCAGAAGAATAACAATTTTGA 

TATGAAAATCGAGGATGTTGTGAAGAACTTGAGTTTTGGCTTATCAAAGCTCACTTAATT 

GGTTTCACGTTACATACATATACACATTCATCATCGATTTCTCCGATCGAAGAATCCAAA 

ATCAGTTTTTCCATGAAAGTGGTTTTTTAGTTGTTAAGTTTGTTGTATGGAGATTCTTAA 

GTCATTTAAAGATCCTTGTTCTTGTGTTGTTAAGTGTGCTTTAAGATGCATATCATCAAA

TGTTTAGTAATTATTTCTCTCCAGTTTCATTTGGGACGGAATTTTTTTCGCAGTTGTTGG 

ATATATATTTCCTGCGATGTAAAGC^TTTCGTTAGTTTAATAAACGTCCGATATGTTTCT 

TTGAAAA 

>G1652 Amino Acid Sequence (domain in AA coordinates: 143-215)
MATAMNVFSTKWSSELDIEEYSIIHQFHMNSLVGDVPQSLSSLDDTTTCYNLDASCNKSL
VEERPSKILKTTHISPNLHPFSSSNPPPPKHQPSSRILSFEKTGLHVMNHNSPNLIFSPK
DEEIGLPEHKKAELIIRGTKRAQSLTRSQSNAQDHILAERKRREKLTQRFVALSALIPGL
KKMDKASVLGDAIKHIKYLQESVKEYEEQKKEKTMESVVLVKKSSLVLDENHQPSSSSSS
DGNRNSSSSNLPEIEVRVSGKDVLIKILCEKQKGNVIKIMGEIEKLGLSITNSNVLPFGP
TFDISIIAQKNNNFDMKIEDVVKNLSFGLSKLT*

>G1655 (132.. 755) 

TTTCTAACTAGTC^CATTGAGAGAGAGAGAGAGAGAAAGAGAGACTCTCAGAATCTGAAG 
AAGAAGAAGAGATTGTTGTTTTTGCCTTTTATCATCG6TTTCTTTC 

AATCGGATTTAATGGTGGAGTCTCTGTTCCCGAGCATCGAAAACACAGGTGAATCGTCTC 
GAAGAAAGAAGCCGAGGATATCAGAGACGGCGGAGGCGGAGATAGAGGCACGACGTGTCA 
ACGAAGAAAGCTTGAAGAGATGGAAAACGAATCGTGTGCAACAGATCTACGCTTGTAAGC 
TCGTCGAAGCTTTACGCCGAGTTCGTCAGAGATCTTCC^ 

ATAAACTCGTCTCCGGCGCGGCGAGGGAGATACGTGATACGGCGGATCGAGTTCTAGCTG 

CGTCCGCTCGTGGTACGACTCGGTGGAGCAGAGCGATTTTAGCGAGTCGCGTCCGAGCGA 

AGCTGAAGAAACATAGAAAGGCGAAAAAGTCAACGGGAAATTGTAAATCGAGAAAAGGTC 

TCACGGAGACGAATCGGATTAAGTTACCGGCGGTTGAGAGAAAACTGAAGATTCTTGGCC 

GTTTGGTTCCTGGTTGCCGGAAAGTCTCTGTACCGAATCTTTTAGATGAAGCGACCGATT

ACATCGCAGCGTTAGAGATGCAGGTTCGAGCCATGGAGGCTCTCGCCGAACTTTTAACCG 

C^GCCGCACCACGGACGACGTTGACCGGAACTTAACGGCGGCAGTTAGTTTGTCAGTTGT 

TAATTAGCTTTTCTTTTACCTTTTTACCCCTTTA 

TCGTCGACGCGATTTTAATTTATTAAATTCA 

>G1655 Amino Acid Sequence (domain in AA coordinates: 134-192) 

MVESLFPSIENTGESSRRKKPRISETAEAEIEARRVNEESLKRWKTNRVQQIYACKLVEA

LRRVRQRSSTTSNNETDKLVSGAAREIRDTADRVLAASARGTTRWSRAILASRVRAKLKK

HRKAKKSTGNCKSRKGLTETNRIKLPAVERKLKILGRLVPGCRKVSVPNLLDEATDYIAA

LEMQVRAMEALAELLTAAAPRTTLTGT*

>G1671 (188.. 751) 

tccc^ctatccttcgc^gacccttcctctatataaggaagttcatttcatttggagagg 
acacgctgacaagctgactctagcagatctggtaccgtcgaccctctctatataatcttc 
ttcta(^ca<^c^ca<^(^cgc^ 

gttagcaatgaatctaccaccgggatttaggttttttccgaccgatgaagagctcgtcg 
tcacttcctccaccggaaagcttccctcttgccttgtcaccctgatgtcatccccgacct 

TGATCTTTACCATTACGATCCTTGGGACCTTCCCGGGAAAGCTTTGGGAGAAGGGAGGCA 

ATGGTACTTCTATAGTAGAAAGACACAAGAGAGAGTGACAAGCAATGGGTATTGGGGATC 

AATGGGAATGGACGAGCCAATCTACACAAGCTCCACACACAAGAAAGTGGGAATCAAAAA 

GTATCTAACTTTCTATCTCGGAGATTCTCAGACTAATTGGATCATGCAAGAATATTCCCT 

CCCGGATTCCTCTTCTTCATCTAGTCGATCTTCTAAGAGATCAAGCCGTGCTTCTAGTTC 

TAGTCACAAACCCGATTATAGCAAGTGGGTGATATGCAGAGTGTATGAGCAAAATTGCAG

TGAGGAGGAAGACGATGATGGGACAGAACTCTCATGTTTGGATGAAGTGTTTTTGTCTTT 

AGATGATCTTGACGAAGTAAGCTTACCGTAATAAAGACAGAAGCACCCAAGAAGAGAAAA 

AAAAAAAAAGGGTTTAGTGGGCAATTATTTCTAAGCGACCGCTCTAGACAGGCCTAGTAC 

CGGATCCTCTAGCTAGAGCTTTCGTTCGTATCATCGGTTTCGACAACGTTCGTCAAGT 

>G1671 Amino Acid Sequence (domain in AA coordinates: TBD) 

MNLPPGFRFFPTDEELVVHFLHRKASLLPCHPDVIPDLDLYHYDPWDLPGKALGEGRQWY

FYSRKTQERVTSNGYWGSMGMDEPIYTSSTHKKVGIKKYLTFYLGDSQTNWIMQEYSLPD

SSSSSSRSSKRSSRASSSSHKPDYSKWVICRVYEQNCSEEEDDDGTELSCLDEVFLSLDD
LDEVSLP*
>G1756 (71.. 1003) 

ATATGTACTTGTACAeCAACCCACC^U^AAGAGATAAAAGAGGAAACAAAAA 

AGAGAGATATATGGGTGAGGTGGCTTATATGGACGAAGGAGACCTAGAAGCAATAGTCAG 

AGGCTACTCCGGCTCCGGAGACGCGTTTTCCGGCGAAAGTTCCGGTACGTTTTCACCTTC 

GTTTTGCCTACCGATGGAGACGTCTAGTTTCTACGAACCGGAGATGGAGACAAGTGGCTT 

AGATGAGCTCGGTGAACTTTACAAACCCTTTTACCCTTTCTCCACACAAACGATCCTCAC 

AAGCTCGGTCTCTCTCCCTGAAGATTCAAAACCTTTCCGAGATGACAAGAAACAACGATC 

ACATGGTTGTCTTTTATCCAACGGATCAAGAGCTGATCATATCCGAATTTCAGAATCCAA 

ATCAAAGAAAAGCAAGAAGAATCAACAGAAGAGAGTTGTTGAGCAAGTGAAAGAAGAGAA 
TCTGTTGTCGGACGCATGGGCGTGGCGTAAATACGGGCAGAAACCCATCAAAGGATCTCC
ATACCCAAGGAGTTATTACAGATGCAGTAGCTCAAAAGGGTC 

CGAAAGAAATCCTCAAAACCCGGAGAAATTCACCATAACATACACTAATGAGCACAATCA 
TGAACTACCAACCCGGAGAAACTCATTAGCCGGTTCGACTCGAGCAAAAACTTCCCAACC 
CAAACCAACCTTAACCAAAAAATCCGAAAAAGAAGTTGTTTCTTCCCCTACAAGTAATCC 
TATGATCCC^TCCGCTGATGAATCTTCTGTTGCGGTTCAAGAAATGAGCXSTTGCGGAAAC 
GAGTACGCACCAAGCGGCTGGAGCAATCGAGGGCCGCCGCTTGAGTAACGGTTTACCATC 
GGATTTGATGTCCGGGAGCGGAACTTTTCCAAGTTTTACCGGTGACTTCGATGAACTATT 
GAATAGCCAAGAGTTCTTCAGTGGGTATTTATGGAATTACTAGAGAGCATTAGGTGTATG
TATATATATAT 

>G1756 Amino Acid Sequence (domain in AA coordinates: TBD) 
MGEVAYMDEGDLEAIVRGYSGSGDAFSGESSGTFSPSFCLPMETSSFYEPEMETSGLDEL
GELYKPFYPFSTQTILTSSVSLPEDSKPFRDDKKQRSHGCLLSNGSRADHIRISESKSKK
SKKNQQKRVVEQVKEENLLSDAWAWRKYGQKPIKGSPYPRSYYRCSSSKGCLARKQVERN
PQNPEKFTITYTNEHNHELPTRRNSLAGSTRAKTSQPKPTLTKKSEKEVVSSPTSNPMIP

SADESSVAVQEMSVAETSTHQAAGAIEGRRLSNGLPSDLMSGSGTFPSFTGDFDELLNSQ

EFFSGYLWNY*
>G1757 (250..1224)

ATCACCAATCCTATAACACTCTCATTCTCATCATATCATTCTTCAATCTATATAACCCAT 
TCTTAATTATACTGAACACACATTATAT^^ 

ATATAACCAATTCTTGATTTATACTTAAAACACACATTATACATCTTTCTCATCATAGTT

TGTATCAATTTCCTAGAGTAAACTACCTAAAGGAAAAAAAAAATCTATTTTGGGAATCAT 

ATACTAAAAATGGAAGGAAGAGATATGTTAAGTTGGGAGCAAAAGACATTGCTAAGCGAG 

CTTATCAATGGATTTGATGCGGCCAAAAAGCTTCAGGCACGACTTAGAGAAGCTCCGTCG 

CCGTCGTCATCATTTTCATCACCGGCGACGGCTGTTGCTGAGACTAACGAGATTCTGGTG 

AAGCAGATAGTTTCTTCCTACGAGAGATCTCTTCTTCTGCTAAACTGGTCATCCTCACCG 

AGCGTACAACTTATTCCGACGCCGGTTACTGTAGTCCCGGTGGCAAATCCCGGCAGTGTT 

CCAGAATCTCCGGCATCGATAAACGGAAGTCCGAGAAGTGAAGAGTTTGCCGATGGAGGA 

GGTTCTAGCGAGAGTCATCATCGCCAGGATTACATTTTCAATTCAAAGAAAAGAAAGATG

TTACCAAAGTGGTGAGAAAAAGTGAGAATAAGCCCAGAGAGAGGCTTAGAAGGACCTCAA 

GATGATGTCTTTAGCTGGAGAAAATATGGTCAAAAAGACATTTTAGGCGCCAAATTCCCA 

AGGAGTTATTACAGATGCACACATCGTAGCACACAAAACTGTTGGGCAACGAAACA 

CAGAGATCAGACGGGGATGCTACGGTTTTCGAAGTGACGTACAGAGGAACACACACTTGT 

TCGCAGGCGATCACAAGAACACCACCATTAGCCTCGCCGGAGAAGCGACAAGACACCAGA 

GTCAAACCAGCCATTACCCAAAAGCCAAAGGATATTCTCGAGAGTCTTAAATCCAACTTA 

ACCGTTCGAACCGATGGGCTTGATGATGGTAAAGACGTTTTCTCGTTCCCTGATACGCCG 

CCGTTTTACAATTACGGAACTATCAACGGCGAGTTCGGCCACGTGGAGAGTTCTCCGATC 

TTCGACGTTGTTGACTGGTTCAATCCAACGGTCGAGATTGACACAACTTTCCCCGCGTTT 

TTACACGAGTCGATTTATTATTAATTAAAATTTGTAACAGAGAAATAGATAGTAACTAGT 

AAGTAATGATCAGCGAGAGTTAAAACATAAAAGTACTTAGAGTAATCTAACGATGCATAA 

TAAGGAATGTTCAACAGGACTTGAACATGATTTCAATACTAAGAGAGATTTATCTAGCTA 

CTGGTAGTAGCCGGAGACTTCTTGTTGTAGCTTCAOT 

>G1757 Amino Acid Sequence (domain in AA coordinates: 158-218) 
MEGRDMLSWEQKTLLSELINGFDAAKKLQARLREAPSPSSSFSSPATAVAETNEILVKQI
VSSYERSLLLLNWSSSPSVQLIPTPVTVVPVANPGSVPESPASINGSPRSEEFADGGGSS
ESHHRQDYIFNSKKRKMLPKWSEKVRISPERGLEGPQDDVFSWRKYGQKDILGAKFPRSY
YRCTHRSTQNCWATKQVQRSDGDATVFEVTYRGTHTCSQAITRTPPLASPEKRQDTRVKP
AITQKPKDILESLKSNLTVRTDGLDDGKDVFSFPDTPPFYNYGTINGEFGHVESSPIFDV
VDWFNPTVEIDTTFPAFLHESIYY*
>G1782 (1..927)

ATGCAAGTGTTTCAAAGGAAAGAAGATTCATCTTGGGGAAACTCAATGCCTACAACAAAT 

TCAAATATTCAAGGATCTGAATCTTTCAGCTTGACTAAGGATATGATAATGTCTACAACA 

CAATTACCCGCGATGAAACATTCGGGTTTGCAGCTGCAAAATCAAGATTCAA 

CAATCTACTGAAGAAGAATCAGGCGGCGGTGAAGTTGCAAGCTTTGGAGAATATAAGCGT 

TATGGATGCAGCATTGTTAATAACAATCTCTCAGGTTACATCGAAAACTTGGGAAAGCCT 

ATTGAAAATTATACTAAGTCAATTACTACCTrCGTCGATGGTGTCTCAAGACTCTGTGTTT 

CCTGCTCCTACTTCTGGTCAAATATCTTGGTCTCTTCAATGTGCTGAAACGTCACATTTC 
AATGGTTTCTTGGCTCCTGAATATGCATCAACACCAACGGCGCTGCCACATTTAGAGATG
ATGGGTTTGGTTTCTTCAAGAGTGCCATTGCCTCATCACATTCAAGAGAATGAACCAATA 
TTTGTCAATGCGAAACAGTATCATGCGATTCTCCGTCGCAGGAAGCACCGTGCTAAACTC 
GAAGCTCAGAACAAACTCATCAAATGCCGTAAACCGTACCTTCATGAGTCTCGCCATCTT 
CATGCTTTAAAGAGAGCTAGAGGCTCCGGTGGACGTTTCCTC^^TACAAAGAAGCTTCT^A 
GAATCATCAAACTCACTGTGTTCTTCTCAAATC^ 

CCTC^CGGTGGTGGAAGCGGAATCGGGTCTAGTTCGATCTCACCGAGCTCCAATTCAAAC 

TCTATCAAGATGTTCCAAAACCCGCAGTTCAGATT^ 

GCCTCAGCTCTCATGTCAGGGACTTGA 

>G1782 Amino Acid Sequence (domain in AA coordinates: 166-238) 

MQVFQRKEDSSWGNSMPTTNSNIQGSESFSLTKDMIMSTTQLPAMKHSGLQLQNQDSTSS

QSTEEESGGGEVASFGEYKRYGCSIVlsntfl^ 

PAPTSGQISWSLQCAETSHFNGFLAPEYASTPTALPHLEMMGLVSSRVPLPHHIQENEPI
FVNAKQYHAILRRRKHRAKLEAQNKLIKCRKPYLHESRHLI^ 

ESSNSLCSSQMANGQ1TFSMSPHGGGSGIGSSSISPSSNSNCINMFQNPQFRFSGYPSTHH 

ASALMSGT* 

>G184 (327.. 1937) 

TGAATTCTAGCCTTTTTGTAGGCGAATCATCTGGACCGGTAAGAGACTCTCTCATCGATA 

ATAACCACATAATTTAATCAAACTCTTTCTCTCTCTTTCTAAGATCTTTTGCTTTGCTC 

TTTCCTTTTTGATCTTCCTATATATGGAGAAGCACCAAAACGGTACTTACTATACGATAC 

TGTACGGATCCATCAAACTGGATTAATTATCAAAACGTACATTTTTATCTTACCTGGCAA 

GTTACATTCCTAGGGTTTTGGAGAATCCAATCAACAACAAAGAAAATAATCATCGTTACA 

ATAATCAGTATCACGCACAGACTTAGATGTTCCGGTTTCCAGTGAGTCTAGGCGGTTCAC 

GTGACGAAGACCGTCACGATCAGATCACACCGTTGGATGACCATCGTGTGGTGGTTGATG 

AGGTTGACTTCTTCTCAGAGAAGAGAGATAGGGTTTCACGTGAGAACATCAACGACGACG 

ACGACGAAGGCAATAAGGTTCTC^TCAAAATGGAGGGTTCACGAGTTGAAGAAAACGATG 

GTTCGAGAGATGTCAATATCGGTCTGAATCTTCTGACCGCGAATACGGGAAGCGATGAGT 

CAACGGTGGATGATGGACTATCAATGGATATGGAAGATAAACGTGCAAAGATTGAGAACG 

CACAACTACAAGAAGAGCTCAAGAAGATGAAAATAGAGAATCAAAGGCTAAGAGATATGT 

TGAGCCAAGCGACGACCAACTTCAATGCCTTACAAATGCAACTTGTTGCCGTCATGAGG 

AACAAGAACAACGTAACTCTTCACAAGATCATCTCCTGGAGAGCAAAGCAGAAGGAAGGA 

AACGGCAGGAACTGCAAATGATGGTGCCAAGGCAGTTCATGGACCTTGGGCCGTCGTCTG 

GAGCAGCAGAGCATGGAGCCGAAGTGTCATCTGAAGAGAGGACAACGGTTCGTTCAGGTT 

CTCCTCCTTCGCTTCTAGAAAGTTCCAATCCCCGAGAGAACGGAAAGAGGTTGCTTGGAA 

GAGAAGAAAGCTCAGAGGAATCAGAGTCTAACGCCTGGGGAAACCCTAACAAAGTCCCCA 

AACATAATCCATCCTCTAGCAATAGCAATGGAAACAGAAACGGAAATGTTATTGATCAGT 

CGGCCGCAGAAGCCACCATGCGGAAAGCCCGTGTCTCAGTTCGTGCCCGATCTGAAGCTG 

CCATGATAAGCGATGGATGTCAATGGAGAAAGTACGGACAAAAAATGGCTAAAGGAAACC 

CGTGTCCGCGGGCTTATTATCGTTGCACAATGGCCGGTGGATGTCCAGTTCGCAAGCAAG 

TGCAGCGTTGCGGAGAAGACAGATCTATTCTCATAACCACCTACGAAGGAAACCACAACC 

ATCCACTCCCAC(^GCCGCTACGGCC^TGGCCTC^C^CCACCGCAGCTGCAAGCATGC 

TCCTCTCGGGCTCAATGTCGAGTCAAGACGGTTTAATGAACCCAACAAACCTCCTAGCTC 

GAGCTATCTTGCCTTGCTCCTCAAGGATGGCTAC?^ 

CCATCACATTGGACCTCACCAATTCACCCAACGGTAACAACCCTAATATGACCACTAATA 

ACCCGTTGATGCAGTTCGCTCAACGGCCCGGTTTCAACCCGGCAGTTTTGCCTCAAGTGG 

TTGGTCAAGCTATGTACAATAACCAACAACAGTCCAAGTTTTCTGGTTTACA 

CTCAGCCACTGCAGATCGCGGCCACTTCCTCGGTGGCCGAGAGCGTTAGTGCTGCCAGTG 

CAGCAATTGCGTCGGATCCAAACTTTGCGGCGGCTCTAGCGGCAGCGATCACGTCCATTA 

TGAACGGTTCCAGTCATCAAAATAATAACACCAATAATAATAATGTGGCTACGAGCAACA 

ATGACAGTAGGCAATAAGAGTTTTCATTTTGATGGTCGATTTTTTTTTTTGGGG 

>G184 Amino Acid Sequence (domain in AA coordinates: 295-352) 

MFRFPVSLGGSRDEDRHDQITPLDDHRVVVDEVDFFSEKRDRVSRENINDDDDEGNKVLI

KMEGSRVEENDGSRDVNIGLNLLTANTGSDESTVDDGLSMDMEDKRAKIENAQLQEELKK

MKIENQRLRDMLSQATTNFNALQMQLVAVMRQQEQRNSSQDHLLESKAEGRKRQELQMMV

PRQFMDLGPSSGAAEHGAEVSSEERTTVRSGSPPSLLESSNPRENGKRLLGREESSEESE
SNAWGNPNKVPKHNPSSSNSNGNRNGNVIDQSAAEATMRKARVSVRARSEAAMISDGCQW
RKYGQKMAKGNPCPRAYYRCTMAGGCPVRKQVQRCAEDRSILITTYEGNHNHPLPPAATA
MASTTTAAASMLLSGSMSSQDGLMNPTNLLARAIL
PNGNNPNMTTNOTLMQFMRPGra 

SSVAESVSAASAAIASDPNFAAALAAAITSI^GSSHQNITO^ 
>G1845 (111..989)

AAGACATAATTTTCTCTGTTTTCCTAGCTCTCTCCTCTCAAATTCTTCCATTGCTCTCTG 
TTTTGGCAAATCGTGAACTGCGACGTCTTTAAGGCAT 

ACGAGGAGCTAAATCTTTGTATTACGAAAGGTAAAAATGTTGATCATTCTTTTGGAGGAG 
AAGCTTCTTCCACGTCCCCAAGATCTATGAAGAAAATGAAGAGTCCTAGTCGTCCTAAAC 

CCTATTTCCAATCCTCTTCTTCTCCTT^ 

CAACACTTGAGAATCAGCAACAACAACTCGGATCAT^ 

AAGACCCGACAATGCAAGGCCAGAAGGAAATGATCTCCTTTAGTC 

AGCAGCAGCAGTATATGGCCCAGTACTGGAGTGACACATTGAATCTGAGTCCAAGAGGAA 
GAATGATGATGATGATGAGCCAAGAAGCTGTTCAACCTTACATCGCAACGAAGCTGTACA 
GAGGAGTGAGACAACGTCAATGGGGAAAATGGGTCGCAGAGATCCGTAAGCCACGAAGCA 
GGGCACGTCTTTGGCTTGGTACCTTTGATACAGCTGAAGAAGCTGCCATGGCCTACGACC 
GCCAAGCCTTCAAATTACGAGGCCACAGCGCAACACTGAATTTC 

ATAAGGAAAGCGAGCTGCATGATTCAAACTCGTCGGATCAGAAAGAACCTGAAACGCCAC 

AGCCT^GCGAGGTTAACTTGGAGAGCAAGGAACTACCGGTGATTGATGTTGGGAGAGAGG 

AAGGTATGGCTGAGGCATGGTACAATGCCATTACATCGGGATGGGGTCCTGAAAGTCCTC 

TTTGGGATGATTTGGATAGTTCTCATCAGTTTTCATGAGAAAGCTCATCTI^ 

TCTCTTGTCCTATGAGGCCTTTCTTTTGAAAAAGTTTATAAACCCACATTGTGTTGTAGG 

TTATAGTTTAGGGTTATGCTCATTGGCATTTGGATGGA 

TCCACCACATATCAGTCATTATATGTGTCTACCT^ 

TGTTTTTATTATGTGTCTGTATGTGTTTCCCTATTGCTACATACATAGATGTCCTCTTTG 
TTCAAAAAAAAAAAAAAAAAAAAAAA 

>G1845 Amino Acid Sequence (domain in AA coordinates: 140-207) 
MDFDEELNLCITKGKl^HSFGGEASSTSPRSMKK^ 

SLDPTLQNQQQQLGSYVPVLEQRQDPTMQGQKQMISFSPQQQQQQQQYMAQYWSDTLNLS
PRGRMMM^SQEAVQPYIATKLYRGWQRQWGK^ 

AYDRQAFKLRGHSATLNFPEHFVITKESELHDSNSSDQKEPETPQPSEVN^ 
GREEGMAEAWYNAITSGWGPESPLWDDLDSSHQFSSESSSSSPLSCPMRPFF*
>G1879 (3.. 917) 

AAATGCCCTTAGAGGCTGTCGTATACCCGCAAGATCCATTCGGATATCTCTCCAATTGCA 
AAGATTTTATGTTCCACGACTTATACTCTCAAGAAGAGTTCGTAGCTCAAGATACGAAGA 
ACAACATTGATAAGTTAGGGCATGAACAGAGCTTTGTGGAACAAGGTAAGGAGGACGATC 
ATCAATGGCGAGACTATCATCAGTATCCTTTGTTGATCCCTTCGTTGGGAGAAGAGCTTG 
GTCTTACCGCCATTGATGTGGAGAGTCATCCTCCTCCACAGCACCGGAGGAAGAGGAGGA 
GAACGAGAAACTGCAAGAACAAGGAAGAGATCGAGAACCAGAGAATGACTCACATCGCCG 
TCGAGAGAAATCGCCGGAAACAGATGAACGAGTATCTGGCTGTGCTCCGTTCTCTAATGC 
CGTCGTCGTATGCTCAAAGAGGAGATCAAGCGTCGATAGTAGGAGGAGCTATAAACTACG 
TGAAGGAGTTAGAGCATATTTTACAATCTATGGAGCCGAAGAGAACTAGGACTCATGATC 
CCAAAGGAGACAAGACTAGCACTAGCTCGTTAGTGGGTCCATTCACAGATTTTTTCAGCT 
TCCCACAATATTCTACAAAGTCATCATCAGATGTACCGGAAAGCTCATCTTCACCGGCGG 
AGATAGAGGTTACGGTGGCAGAAAGCCATGCGAACATCAAGATAATGACGAAGAAGAAAC 
CGAGGCAGCTTCTTAAGCTCATAACTTCTTTACAAAGCCTAAGGCTCACrrCTTCTTCATC 
TCAATGTCACCACTCTCCACAACTCCATTCTCTACTCCATCAGCGTCAGGGTTGAAGAAG 
GAAGCCAACTGAATACCGTGGACGACATTGCAACAGCTTTGAATCAAACCATAAGGAGGA 
TTCAAGAAGAGACATAATTCAGCAAATAGATTATAATTAACTTGTTTTATTTTTATTTTA 
TTTTGAAATAACTGAAATCAGTTTTCTAATTTTTTTTTTTTTTCACTATTCCTCTAATCC 
TCCCTATGTAAGTTGCATTTTTGTCTCTTGTAATGAATCAATGGTCATAAAGATCTGAAC 
AAAAAAATTGAATAAAAGAAAATGGTT 

>G1879 Amino Acid Sequence (domain in AA coordinates: 107-176) 
MPLEAVVYPQDPFGYLSNCKDFMFHDLYSQEEFVAQDTKNNIDKLGHEQSFVEQGKEDDH
QWRDYHQYPLLIPSLGEELGLTAIDVESHPPPQHRRKRRRTRNCKNKEEIENQRMTHIAV
ERNRRKQMNEYLAVLRSLMPSSYAQRGDQASIVGGAINYVKELEHILQSMEPKRTRTHDP
KGDKTSTSSLVGPFTDFFSFPQYSTKSSSDVPESSSSPAEIEVTVAESHANIKIMTKKKP
RQLLKLITSLQSLRLTLLHLNVTTLHNSILYSISVRVEEGSQLNTVDDIATALNQTIRRI
QEET*

>G1888 (1..729) 

ATGAAGATTTGGTGTGCTGTTTGTGATAAAGAAGAAGCTTCGGTGTTTTGTTGTGCGGAT 

GAAGCAGCTCTTTGTAATGGTTGCGATCGCCATGTTCATTTCGCCAATAAACTAGCCGGG 

AAACATC^CCGGTTCTCTCTCACTTCTCCTACTTTCAAAGATGCTCCTCTTTGTGATATT 

TGCGGGGAGAGGCGTGCATTATTATTTTGCCAAGAAGACAGAGCAATACTATGCAGAGAA 

TGTGACATTCCAATACATCAAGCTAATGAGCACACTAAGAAACACAATAGAT^ 

ACCGGCGTTAAGATCTCTGCCTCCCCGTCAGCCTACCC^GAGCCrCCAATTCCAACTCT 

GCTGCTGCATTTGGTCGAGCCAA^CCCGACCAAAATCAGTATCGAGCGAGGTCCCGAGC 

TCGGCCTCC^U^TGAGGTATTTACGAGCTCTTCTTCGACGACCACGAGCAATTGCTATTAT 

GGGATAGAAGAAAACTACCATCACGTGAGCGATTCGGGGTCGGGATCGGGTTGTACAGGT 

AGTATATCCGAGTATTTGATGGAGACATTACCGGGTTGGAGAGTGGAGGATTTGCTTGAA 

CACCCTTCTTGTGTCTCCTATGAGGATAACATTATTACTAATAACAATAACAGTGAGTCT 

TATAGGGTTTATGATGGTTCTTCACAATTCCATGATCAAGGGTTTTGGGATCACA 

TTCTCTTGA 

>G1888 Amino Acid Sequence (domain in aa coordinates: 5-50) 
MKIWCAVCDKEEASVFCCADEAALCNGCD^ 

CGERRALLFCQEDRAILCRECDIPIHQANEHTKKHNRFLLTGVKISASPSAYPRASNSNS 
AAAFGRAKTRPKSVSSEVPSSASNEVFTSSSSTTTSNCYYGIEENYHHVSDSGSGSGCTG
SISEYLMETLPGWRVEDLLEHPSCVSYEDNIITNHNNSESYRVYDGSSQFHHQGFWDHKP
FS* 

>G189 (34.. 987) 

CCACAACTCTCTCCTTGTAGAGAGAGAGATTTTATGGCGGTGGAGCTCATGACTCGGAAT 
TACATCTCCGGCGTCGGAGCTGATAGCTTCGCCGTTCAAGAAGCAGCTGCTTCAGGACTC 
AAAAGTATCGAAAATTTCATCGGTTTAATGTCTCGTGATAGCTTTAACTCTGATCAGCCA 
TCTTCTTCTTCCGCCTCCGCCTCCGCCTCCGCCGCCGCAGATCTTGAATCAGCTCGTAAC 
ACAACGGCGGACGCGGCTGTTTCAAAGTTTAAAAGAGTCATATCTCTCTTAGATCGAACT 
CGAACCGGAGACGCCCGGTTTAGACGTGCTCCGGTTCATGTTATTTCTCCGGTTCTTTTA 
CAAGAAGAACCAAAAACGACGCCGTTTCAGTCTCCTCTTCCTCCTCCGCCGCAAATGATC 
CGAAAAGGTTCGTTTTCTTCATCGATGAAAACGATTGATTTCTCATCTCTCTCCTCTGTA
ACAACGGAATCAGACAACCAGAAGAAGATTCATCATCATCAACGTCCCTCTGAAACGGCG 
CCGTTTGCGTCTCAAACTCAAAGCCTCTCCACGACGGTCTCGTCTTTCTCAAAATCAACA 
AAGAGAAAATGTAACTCTGAGAATCTTCTCACCGGAAAATGCGCTTCCGCTTCTTCCTCC 
GGTCGTTGTCATTGCTCGAAGAAAAGAAAGATAAAACAGAGGAGAATAATTAGGGTTCCG 
GCGATAAGTGCAAAAATGTCCGATGTACCACCGGACGATTATTCATGGAGGAAATACGGA 
CAAAAACCAATTAAAGGATCTCCACATCCAAGAGGATATTATAAGTGTAGTAGCGTAAGA 
GGTTGTCCAGCACGTAAACATGTTGAGAGAGCAGCTGATGATTCGTCCATGTTGATTGTT 
ACTTATGAAGGAGATCATAATCATTCTCTCTCCGCCGCTGATCTCGCCGGAGCCGCCGTT 
GCTGATCTTATTTTGGAATCGTCTTGAAAAGAACAAATCTTTATTTAAGGCTTTTATAAT 
ATAAATTTAGATCCTTACTTAGTGAAGTACTCAAACTATGAATGAAATCAATGTAATCAA 
AATCAAAAAGCTTTTGCTAAAAAAAAAAAAAAAAA 

>G189 Amino Acid Sequence (domain in AA coordinates: 240-297) 
MAVELMTRNYISGVGADSFAVQEAAASGLKSIENFIGLMSRDSFNSDQPSSSSASASASA
AADLESARNTTADAAVSKFKRVISLLDRTRTGHARFRRAPVHVTSPVLLQEEPKTTPFQS
PLPPPPQMIRKGSFSSSMKTIDFSSLSSVTTESDNQKKIHHHQRPSETAPFASQTQSLST
TVSSFSKSTKRKCNSENLLTGKCASASSSGRCHCSKKRKIKQRRIIRVPAISAKMSDVPP
DDYSWRKYGQKPIKGSPHPRGYYKCSSVRGCPARKHVERAADDSSMLIVTYEGDHNHSLS
AADLAGAAVADLILESS*
>G1939 (92..844)

AATCATTAGCTTCTTeTCTTCTCTTCTCTCACAGAGAGAGTAATCACAAGCCAAGTGAGA 

AAAAGAAAACACTAAACCCAGATCGAAAACCATGTCTATTAACAACAACAACAACAACAA

CAACAATAACAACGATGGTCITATGATCTCATCAAACGGAGCTTTAATCGAAC^ 

ATCAGTCGTTGTGAAGAAACCACCGGCGAAAGATCGACATAGCAAAGTCGATGGAAGAGG 

GAGAAGAATCCGTATGCCGATTATATGTGCTGCrraSTGTTTTTCAGCTAACGAGAGAGCT 

TGGTCATAAGTCAGATGGCCAAACAATTGAATGGTTACTTCGTCAAGCAGAGCCTTCTAT 

TATAGCTGGAACAGGAACTGGTACAACTCCAGCGAGTTTCTCAACTGCTTCTGTCTCTAT 

CCGTGGAGCCACCAATTCTACTTCTTTAGATCATAAACCCACTTCTTTACTTGGTGGTAC 






GTCACCGTTTATACTTGGGAAACGTGTTAGAGCTGATGAGGATAGTAATAATAGTCATAA 
TCATAGTTCTGTTGGTAAAGATGAGACCTTO 

TCCGGCGAGGCCGGATTTTGGACAAGTTTGGAGTTTTGCTGGAGCTCCACAAGAGATGTT 
TTTACAACAACAACATCATCATGAGCAACC^ 

AGCTGCAATGGGTGAAGCTTCTGCTGCTAGAGTTGGGAATTATCTTCCGGGTCATCTTAA 
TTTGCTTGCTTCTTTATCCGGTGGATCTCCCGGGTCGGATCGAAGAGAGGAAGATCCACG 
TTAATGGTTTAAGCCCTTTTAGGTTTGAGGGCAA^TTTGGTATATATATTTATTATCTT 
CTCTTCTCTATTGTTGTCATTGTTTCTCTATGTGTGTGTTTTAGTGTTGTTAGAGATTGA 
TTTGGTTTCAGAATCTCTGCAAGTGATTTGAGAGTTTTCGTTAGCTTTAAGTAAGTTAAA 
GACGGTTGTTTTTGATTAGGGTTAAATTAGGGTTTAAGA 

AGATCGATTTCTTATCGGATCCAAGATTACTTTTAGGAAAAAAGGGAAAATTTCAGAAAC 
C^CGGTGGTTTCTTTTCCTCTTTTTTTTTTTG 

>G1939 Amino Acid Sequence (domain in AA coordinates: 40-102) 
MSIITNNNNNNNNNNDGLMI^ 

ARVFQLTRELGHKSDGQTIEWLLRQAEPSIIAATGTGTTPASFSTASVSIRGATNSTSLD

HKPTSLLGGTSPFILGKRVRADEDSNNSHNHSSVGKDETFTTTPAGFWAVPARPDFGQVW

SFAGAPQEMFLQQQHHHQQPLFVHQQQQQQAAMGEASAARVGNYL^ 

GSDRREEDPR* 

>G194 (192.. 1205) 

TCTTTCTTCTCTCTCTATCTCTCCTCTTTGAACCCTAAAAACTCTTTCTTTACAAGGATT 
GATCTTTTTGTATTTTTGATTTTGACATTT^ 

TTTCTCTGTTTTTAAAGCCATTTGATAGATTGTTTCCGGTAAAGCTCAGCGAGAGAAGAA 
GAAGAACAACAATGGAGTTTACAGATTTCTCAAAGACGAGTTTTTACTACCCGTCGTCAC 
AAAGCGTTTGGGATTTCGGAGATTTAGCGGCGGCGGAGAGGCATTCTTTAGGGTTCATGG 
AGTTATTAAGTTCTCAGCAGCATCAAGACTTTGCTACTGTTTCTCCTCATTCCTTCCTTC 
TCCAAACGTCTCAACCGCATU^CGCAAACGCAACCATCGGCGAAGCTGTCTTCAAGTATCA 
TTCAAGCTCCACCGTCAGAGCAATTAGTGACGTCAAAGGTGGAGTCTTTGTGTTCGGATC 
ATTTGTTGATAAACCCACCGGCGACTCCTAACTCGTCATCGATTTCGTCTGCTTCAAGCG
AGGCTCTAAATGAAGAGAAACCGAAAACAGAAGACAATGAAGAAGAAGGAGGTGAAGATC 
AACAAGAGAAGAGTCATACTAAGAAACAGTTGAAAGCAAAGAAGAATAATCAGAAGAGAC 
AGAGAGAGGCAAGAGTCGCATTCATGACAAAGAGTGAAGTTGATCATCTCGAAGATGGTT 
ATCGCTGGCGAAAATATGGTCAAAAAGCTGTGAZy\AACAGTCCTTTTCCCZAGGAGTTACT 
ACCGTTGCACAACGGCTTCATGTAACGTGAAGAAGAGAGTGGAGAGATCATTCAGAGATC 
CAAGCACTGTGGTTACAACCTACGAAGGTCAACACACTCACATTAGTCCACTCACGTCTC 
GTCCTATTTCCACTGGAGGTTTCTTCGGATCGTCAGGAGCTGCTTCGAGTCTCGGTAATG 
GTTGCTTTGGGTTTCCTATTGATGGCTCCACGTTAATCTCTCCTCAGTTCCAACAGCTTG 
TCCAATACCATCACCAACAGCAGCAACAAGAACTCATGTCTTGTTTTGGAGGAGTCAACG 
AGTACCTTAATAGCCACGCTAATGAGTATGGTGATGATAATCGTGTGAAGAAGAGTCGAG 
TTTTGGTTAAAGATAATGGACTTCTGCAAGATGTTGTTCCGTCTCATATGTTGAAGGAAG 

AAAGAAACGGATCTTTTGTTCTGATGAAGAAGATGTTTTCTTATGGTTCTGAAATCGTAA 
GGTAATGATGATTGTACCAAGCCGAGAAAGTACTTGTGATTTTCACCATTGAATCACTAT 
AAATGTAATTTTTATTTACTGTGAAAAAAAAAAAAAAA 

>G194 Amino Acid Sequence (domain in AA coordinates: 174-230) 

MEFTDFSKTSFYYPSSQSVWDFGDLAAAERHSLGFMELLSSQQHQDFATVSPHSFLLQTS

QPQTQTQPSAKLSSSIIQAPPSEQLVTSKVESLCSDHLLINPPATPNSSSISSASSEALN

EEKPKTEDNEEEGGEDQQEKSHTKKQLKAKKNNQKRQREARVAFMTKSEVDHLEDGYRWR

KYGQKAVKNSPFPRSYYRCTTASCNVKKRVERSFRDPSTVVTTYEGQHTHISPLTSRPIS

TGGFFGSSGAASSLGNGCFGFPIDGSTLISPQFQQLVQYHHQQQQQELMSCFGGVNEYLN

SHANEYGDDNRVKKSRVLVKDNGLLQDVVPSHMLKEE*

>G1943 (137..1858)

ACATTTGTTTCTAATCTCAGACATAAATAATT^^ 

ATTATATCATTCCACATTCATTTTCTTCTACTTCTTCCTTCTCCTTGATCTCATTTCCCT 
AGAAAATCCATCTATCATGGGTGAAGATGATATAGTGGAGCTCTTATGGAAGAGTGGCCA 
AGTCGTTAGAACCAGTC^U^CACAGAGACCCTCCTC 

ACCACCCATTCTTCGTGGTAGCGGAAGCGGCAACGGAGAAGAAAATGCCCCGCTTCCACT 
TCCACAGCCTTCACCTCCCCTCCATCATC^GAATCTTTTCATTCTGGAAGACGAAATGTC 
TTCTTGGCTTCACCATTCTCACCCCGGCGTTACGTCCACCCCGGCTTCTTCTGTCTCCCT
GCCACCACCACCCAATGCTCCGCGTGAAGATGATATAGTGGAGCTTTTATGGCAAAGCGG 
CCAAGTAGTTGGAACCAACCAAACTVCATAGACAATCCTACGATCCrCCTCCCATTCTCCG 
CGGCAGCGGAAGTGGCAGAGGAGAAGAAAATGCTCCCCTTTCACAACCTCCGCCTCACCT 
GCATCAGCAAAATCTCTTCATTCTU^GAAGGCGAAATGTATTCGTGGCTAC^CCATTCTTA 
CCGCCAAAACTATTTCTGCTCAGAACTTCTCAACTC 

TTCCATCTCTCTGGCACC^CGTCAGACTATCGCCACGAGAAGGGCGGAAAACTTTATGAA 

CTTCTCGTGGCTAAGAGGGAACATATTTACCGGCGGTAGAGTTGATGAAGCTGGACCGTC 

GTTTTCGGTGGTAAGAGAATCGATGCAGGTAGGCTCGAACACGACCCCCCCTTCTTCTTC 

TGCCACTGAATCATGTGTAATACCAGCTACAGAGGGCACCGCGAGTCGAGTGTCGGGAAC 

TTTGGCAGCTCATGATCTTGGTCGGAAGGGAAAGGCGGTGGCGGTTGAGGCGGCCGGAAC 

ACCATCTTCAGGAGTGTGCAAGGCCGAAACAGAGCCGGTTCAGATACAACCAGCAACGGA 

GTCGAAGCTAAAAGCGAGAGAAGAAACCCATGGAACTGAAGAAGCTCGTGGTTCAACGTC 

TAGAAAGAGATCACGAACTGCAGAAATGCATAACCTCGCCGAAAGGAGAAGGAGAGAAAA 

GATCAACGAGAAGATGAAGACTCTGCAACAACTCATTCCTCGCTGCAACAAGGTTGA 

TGATTCTGTTTCTACTCTGATCAGTCTACTAAAGTTTCAACGCTGGATGATGCTATCGAG 

TACGTCAAATCGTTAGAGAGCCAAATACAAGTATGCTC^ 

ACCAATGGTTCAACATGGAAAGAGTTCATATGTATCTAGTTTXGTTGAGATGATGTCGAC 

GGGACAGGGTATGATGTCGCCAATGATGAATGCCGGGAATACGCAACAGTTCATGCCCCA 

TATGGCCATGGATATGAACCGACCTCCTCCATTCATACCrTTCCCCGGCAC^ 

TATGCCGGCTCAAATGGCAGGTGTAGGTCCATCATATCCAGCACCGCGCTACCCTTTTCC 

CAACATTCAGACCTTTGACCCATCCAGAGTCCGTTTACCAAGCCCGCAGCCTAACCCGGT 

GTCGAACCAGCCTCAGTTTCCGGCTTACATGAATCCCTATAGCCAGTTTGCTGGTCCCCA 

CCAGTTGCAACAACCTCCTCCTCCTCCATTTCAGGGTCAAACAACATCAC^ 

CGGGCAGGCAAGTAGTAGCAAGGAACCTGAGGATCAGGAGAACCAACCAACAGCTTAGTT 

AAAGTGTGGAGCTGAAACGGATCAGTTCTTCAAGCAAATTACAACTTTGAAGATAAACCA 

GAGTTGTAACATGTAGATTTTGTCTGTTAAGTTTAATGTAAGTACTTTTTAGTTAATGGG 

AAAGATACTGACAGGTTGCAAGGTGGTCAGTATTTGTGCATCACGCTTAAGATTCCTCGA 

TGTGGCC^GTATCTCCCTTTTCTAGCATGTGAGGTCCCTACTCTCTGGTTCTACGGAGAC 

CAAATGTTCGACTGATTAAACACACAATGACTTACCAAAAGTACACGCGGCCCATCCTCG

TCTTTATGTTCCAAGTGCGACTGTTTGTTTATTTGTAAGCATTTTTCTTATAATAATAAA 

ACAGCTCTATCTTCGTTAAAAAAAA 

>G1943 Amino Acid Sequence (domain in AA coordinates: 335-406) 

MGEDDIVELLWKSGQVVRTSQTQRPSSNTPPSLPPPPILRGSGSGNGEENAPLPLPQPSP

PLHHQNLFILEDEMSSWLHHSHPGVTSTPASSVSLPPPPNAPREDDIVELLWQSGQVVGT

NQTHRQSYDPPPILRGSGSGRGEENAPLSQPPPHLHQQNLFIQEGEMYSWLHHSYRQNYF

CSELLNSTPATHPQSSISLAPRQTIATRRAENFMNFSWLRGNIFTGGRVDEAGPSFSVVR

ESMQVGSNTTPPSSSATESCVIPATEGTASRVSGTLAAHDLGRKGKAVAVEAAGTPSSGV

CKAETEPVQIQPATESKLKAREETHGTEEARGSTSRKRSRTAEMHNLAERRRREKINEKM

KTLQQLIPRCNKVESDSVSTLISLLKFQRWMMLSSTSN 

GKSSYVSSFVEMMSTGQGMMSPMMNAGNTQQFMPHMAMDMNRPPPFIPFPGTSFPMPAQM
AGVGPSYPAPRYPFPNIQTFDPSRVRLPSPQPNPVSNQPQFPAYMNPYSQFAGPHQLQQP
PPPPFQGQTTSQLSSGQASSSKEPEDQENQPTA* 
>G21 (79.. 966) 

TGTGGAGGAATATTAATACAGCCCACTTCACATCTATTTTTGTGCAACCATCTCTCTAAA 
GCTTCTTCTCTCATAACAATGGCAAGACAAATCAACATAGAGAGTAGTGTTTCTCAAGTT 
ACCTTTATCTCCTCCGCCATCCCCGCCGTATCTTCCTCCTCCTCCATCACCGCTTCCGCC 
TCATTGTCCTCTTC^CCTACTACATCTTCCTCTTCTTCGTCATCAACAAATTCTAACTTC 
ATTGAGGAAGACAACTCTAAAAGAAAAGCATCTCGAAGATCATTGTCATCGTTAGTCTCC 
GTTGAAGACGATGATGATCAAAACGGTGGAGGTGGGAAACGGCGAAAGACCAACGGTGGA 
GATAAACATCCGACGTATAGAGGAGTGAGGATGAGGAGTTGGGGAAAATGGGTGTCGGAG 
ATTAGAGAGCCGAGAAAGAAATCAAGAATCTGGCTCGGGACTTATCCAACGGCTGAGATG 
GCAGCTCGAGCTCATGACGTAGCGGCTTTAGCCATTAAAGGTACAACGGCTTACCTCAAT 
TTTCCCAAGTTAGCCGGCGAGCTTCCTCGTCCAGTCACAAATTCrCCTAAAGACATrCAA 
GCCGCCGCCTCTTTAGCGGCCGTTAACTGGCAAGATTCGGTCAACGATGTGAGTAATTCT 
GAAGTGGCTGAAATAGTTGAAGCCGAGCCGAGTCGAGCCGTGGTGGCTCAGTTGTTTTCT 
TCGGACACAAGCACGACGACGACGACTCAGAGTCAAGAGTATTCGGAAGCTTCGTGTGCT 
TCGACTTCGGCGTGTACGGACAAAGACAGTGAGGAAGAGAAGCTGTTTGATTTGCCGGAT
TTGTTTACCGATGAGAATGAGATGATGATACGAAACGATGCGTTTTGCTACTACTCGTCC 
ACGTGGCAGCTTTGTGGAGCCGATGCTGGGTTTCGGCTTGAAGAGCCGTTTTTTCTATCT 
GAATGACTAAAGTACCCCTCTCGAGAGAGCTCTCACTAACACT 

>G21 Amino Acid Sequence (domain in AA coordinates: 97-164) 

MARQINIESSVSQVTFISSAIPAVSSSSSITASASLSSSPTTSSSSSSSTNSNFIEEDNS 

KRKASRRSLSSLVSVEDDDDQNGGGGKRRKTNGGDKHPTYRGVRMRSWGKWVSEIREPRK

KSRIWLGTYPTAEMAARAHDVAALAIKGTTAYLNFPKLAGELPRPVTNSPKDIQAAASLA
AVNWQDSVNDVSNSEVAEIVEAEPSRAVVAQLFSSDTSTTTTTQSQEYSEASCASTSACT

DKDSEEEKLFDLPDLFTDENEMMIRNDAFCYYSSTWQLCGADAGFRLEEPFFLSE*
>G2132 (42.. 1031) 

ATTCTGTTACTTAGTACCGGAGTTTAGTCGGAGAGAGAACAATGATCAGTTTCAGAGAAG 

AGAACATCGATCTCAACTTGATTAAAACAATTAGTGTAATCTGTAATGATCCAGACGCCA 

CCGATTCCTCTAGCGACGATGAATCTATCTCCGGCAATAATCCTCGCCGTCAGATCAAAC 

CAAAACCACCGAAACGTTACGTCTCAAAGATCTGTGTCCCGACGCTGATCAAAAGGTATG 

AGAACGTTTCGAATTCTACAGGGAATAAAGCAGCCGGAAACCGGAAAACGTCGTCGGGTT 

TCAAAGGCGTACGACGGAGGCCGTGGGGGAAATTTGCGGCGGAGATAAGAAATCCGTTTG 

AGAAGAAGAGAAAGTGGCTTGGAACGTTTCCTACTGAAGAAGAAGCAGCAGAAGCTTACC 

AAAAGAGTAAAAGAGAGTTTGATGAACGATTGGGTTTAGTTAAACAGGAAAAAGACCTAG 

TAGATTTGACCAAGCCGTGCGGTGTACGTAAACCAGAAGAGAAGGAAGTTACTGAGAAGT 

CGAATTGCAAAAAGGTAAATAAGAGAATTGTTACTGATCAGAAGCCATTTGGTTGTGGTT 

ATAACGCTGATCATGAAGAAGAGGGAGTGATTAGTAAAATGTTGGAAGATCCGTTGATGA 

CATCGTCAATTGCTGATATTTTTGGTGATTCGGCTGTTGAAGCAAATGATATTTGGGTGG 

ATTACAATTCAGTGGAATTTATTTCCATTGTAGATGATTTCAAGTTTGATTTTGTGGAGA 

ATGATAGAGTAGGAAAGGAGAAAACATTTGGATTTAAGATTGGGGATCACACTAAAGTTA 

ATCAACATGCCAAAATCGTATCGACCAATGGGGACTTATTCGTCGATGATTTACTTGATT 

TTGATCCGTTGATAGATGATTTTAAGTTAGAAGATTTTCCTATGGATGATCTTGGATTAT 

TAGGAGATCC^GAGGATGATGATTTTAGTTGGTTTAATGGTACTACTGATTGGATCGATA 

AGTTTTTATGAATACTTTCTTGACACGGCCAACGGTATTAGTAC 

>G2132 Amino Acid Sequence (domain in AA coordinates: TBD) 

MISFREENIDLNLIKTISVICNDPDATDSSSDDESISGNNPRRQIKPKPPKRYVSKICVP

TLIKRYENVSNSTGNKAAGNRKTSSGFKGVRRRPWGKFAAEIRNPFEKKRKWLGTFPTEE

EAAEAYQKSKREFDERLGLVKQEKDLVDLTKPCGVRKPEEKEVTEKSNCKKVNKRIVTDQ

KPFGCGYNADHEEEGVISKMLEDPLMTSSIADIFGDSAVEANDIWVDYNSVEFISIVDDF

KFDFVENDRVGKEKTFGFKIGDHTKVNQHAKIVSTNGDLFVDDLLDFDPLIDDFKLEDFP 

MDDLGLLGDPEDDDFSWFNGTTDWIDKFL* 

>G2145 (1..777) 

ATGGACGTTTTTGTTGATGGTGAATTGGAGTCTCTCTTGGGGATGTTCAACTTTGATCAA 
TGTTCATCATCTAAAGAGGAGAGACCGCGAGACGAGTTGCTTGGCCTCTCTAGCCTTTAC 
AATGGTCATCTTCATCAACATCAACACCATAACAATGTCTTATCTTCTGATCATCATGCT 
TTCTTGCTCCCTGATATGTTCCGATTTGGTGCAATGCCGGGAGGAAATCTTCCGGCCATG 
CTTGATTCTTGGGATCAAAGTCATCACCTCCAAGAAA 

CTTGACGTGGAGAATCTATGCAAAACTAACTCTAACTGTGACGTCACAAGACAAGAGCTT 

GCGAAATCCAAGAAAAAACAGAGGGTAAGCTCGGAAAGCAATACAGTTGACGAGAGCAAC 

ACTAATTGGGTAGATGGTCAGAGTTTAAGCAACAGTTCAGATGATGAGAAAGCTTCGGTC 

ACAAGTGTTAAAGGCAAAACTAGAGCCACCAAAGGGACAGCCACTGATCCTCAAAGCCTT 

TATGCTCGGAAACGAAGAGAGAAGATTAACGAAAGGCTCAAGACACTACAAAACCTTGTG 

CCAAACGGGAGAAAAGTCGATATAAGGACGATGCTTGAAGAAGCGGTCCATTACGTGAAG 

TTCTTGCAGCTTCAGATTAAGTTGTTGAGCTCGGATGATCTATGGATGTACGCACCATTG 

GCTTACAACGGCCTGGACATGGGGTTCCATCACAACCTTTTGTCTCGGCTTATGTGA 

>G2145 Amino Acid Sequence (domain in AA coordinates: 166-243)

MDVFVDGELESLLGMFNFDQCSSSKEERPRDELLGLSSLYNGHLHQHQHHNNVLSSDHHA

FLLPDMFPFGAMPGGNLPAMLDSWDQSHHLQETSSLK^^ 
AKSKKKQRVSSESNTVDESNTNWVDGQSLSNSSDDEKASVTSVKGKTRATKGTATDPQSL
YARKRREKINERLKTLQNLVPNGTKVDISTMLEEAVHYVKFLQLQIKLLSSDDLWMYAPL
AYNGLDMGFHHNLLSRLM*
>G23 (22.. 732) 
TATCAAACGAGAGTACAAAAGATGACGTCACTCAACAGCTCTGCATCACCAACATCATCG
TCATCAGACCAATCTGATGCAACTACTACAACAAGCTVCCCACrTC 
CCACCC^GAAAC^CAACACAAGAAAGAGAAGGAGAGATTCTTCTTCTG 
TCTTCAATGCAAGATCCTGTTTAC^^ 

TCCGAGATCCGACAACCTCGTAAGAAAACTCGTATTTGGCTCGGCACTTTTGTCACCGCT 
GATATGGCTGCTCGTGCTCACGACGTCGCTGCTCTCACCATCAAAGGCTCCTCCGCCGTC 
TTAAATTTCCCTGAGCTTGCTTCTCTCTTCCCTCGTCCGGCGTCATCATCGCCGCATGAT 
ATCCAGACAGCCGCCGCAGAAGCCGCCGCCATGGTGGTCGAAGAAAAACTGTTAGAGAAG 
GATGAGGCTCCGGAGGCCCCACCTTCGTCGGAATCTTCTTACGTGGCGGCX3GAGTCAGAG 
GATGAGGAGAGGTTGGAGAAAATTGTGGAGCTGCCTAACATTGAAGAAGGAAGTTATGAC 
GAGAGTGTGACATCACGTGCTGATCTGGCTTATTCTGAGCCGTTCGATTGTTGGGTGTAT 
CCTCCGGTTATGGATTTTTATGAAGAAATATCGGAGTTTAATTTCGTGGAATTGTGGAGC 
TTTAATCACTAATTT^AGTTAGGAAAGTGCATTATATTGCAATATTGCATCATAGATAACA 
TTTGTATTTCTTTTCTTTTTGTACGGATACGTAGCATATGCTACTATACTAGGGCTAGTG 
TACCAAATATTGTAAAATATACTTATTAATATTTATGTAAATGTGTAATATATATAACAT 
ACAATTATTGTAAGTTTGGAAATTGGAAACTATCGTTACGCAATGTTCTTGTAAAAAAAA 
AAAAAAAAAA 

>G23 Amino Acid Sequence (domain in AA coordinates: 61-117) 

MTSLNSSASPTSSSSDQSDATTTTSTHLSEEEAPPRNNOT 

YRGVKMRSWGKWVSEIRQPRKKTRIW 

SLFPRPASSSPHDIQTAAAEAAAMVVEEKLLEKDEAPEAPPSSESSYVAAESEDEERLEK
IVELPNIEEGSYDESVTSRADLAYSEPFDCWVYPPVMDFYEEISEFNFVELWSFNH*
>G2313 (104..724)

CGTCGACACAATCGCTCTTCCGTAACATATTCCAC^AACGATCTTCTTGTTTCTTGAAT 

TTTTAGCCATCTCTTTTTTTTTTTTCTCATTTTCTCGGATACTATGGCTTCGAGTCCACG 

CTGGACGGAGGACGACAACAGGCGTTTTAAGTCAGCTCTGTCGCAATTCCCTCCGGATAA 

CAAGCGTTTGGTGl^ATGTCGCCCAGCATCTGCCGAAACCTTTGGAGGAGGTGAAGTACTA 

CTACGAAAAGTTGGTCAACGATGTTTATCTGCCGAAACCTTTAGAGAATGTCACCCAGCA 

TCTGCAGAAACCTATGGAAATGGAGGAGATGAAGTACATGTACGAAAAGATGGCCAACGA 

TGTTAATCAGATGCCCGAGTACGTACCACTGGCGGAATCGAGTCAGTCCAAACGCAGGAA 

GAAGGATACGCCAAATCCTTGGACAGAAGAGGAACACAGATTGTTTCTGCAAGG 

AAAGTATGGGGAAGGAGCTTCGACGTTGACATCAACGAATTTTGTGAAGACAAAGACTCC 

ACGGCAAGTGTCAAGCCATGC^CAGTATTACAAAAGGCAAAAATCGGACAATAAGAAGGA 

GAAACGCCGGAGTATTTTTGACATAACTTTGGAGTCTACCGAGGGCAATCCAGATTCTGG 

AAATCAGAACCCTCCGGATGATGATGATCCGTCCCAAGGTCAAGGCACTTGTCTTGGAGT 

TTAGATGTTGGAAGATAGAAGAATGGTGTGAAAGC 

>G2313 Amino Acid Sequence (domain in AA coordinates: TBD) 

MASSPRWTEDDNRRFKSALSQFPPDNKRLVNVAQHLPKPLEEVKYYYEKLVNDVYLPKPL

ENVTQHLQKPMEMEEMKYMYEKMANDVNQMPEYVPLAESSQSKRRKKDTPNPWTEEEHRL

FLQGLKKYGEGASTLTSTNFVKTKTPRQVSSHAQYYKRQKSDNKKEKRRSIFDITLESTE

GNPDSGNQNPPDDDDPSQGQGTCLGV*

>G2344 (1..573) 

ATGACTTCTTCAATCCATGAGCTTTCTGATAACATTGGAAGTCATGAGAAGCAAGAACAG 
AGAGATTCTCATTTCCAACCACCAATCCCTTCTGG^ 

AGTTTAGTCTACTCAGACCCGGGGACTACAAATTCCATGGCACCTGGACAATATCCATAT 
CCAGATCCTTACTACAGAAGCATATTTGCACC 

CTACAGTTGATGGGAGTGCAGCAACAAGGCGTTCCTTTACCATCTGATGCAGTCGAGGAA 
CCTGTTTTTGTTAAeGCAAAGCAATACCACGGTAT^ 

AGACTTGAGTCTCAGAATAAAGTCATCAAGTC^CGTTUVGCCGTATTTGCATGAATCTCGG 
CATTTGCATGCGATAAGACGACCAAGAGGATGTGGCGGGCGGTTTCTAAATGCCAAGAAG 
GAGGATGAGCATGACGAAGACAGTAGTCATGAAGAAAAATCCAACCTTAGCGCTGGTAAA 
TCCGCCATGGCTGCTTCTAGTGGTACATCTTGA 

>G2344 Amino Acid Sequence (domain in AA coordinates: TBD) 
MTSSIHELSDNIGSHEKQEQRDSHFQPPIPSARlTJifESIVTSLVYSDPGTTNSMAPGQYPY 
PDPYYRSIFAPPPQPYTGVHLQLMGVQQQGVPLPSDAVEEPVFVNAKQYHGILRRRQSRA 
RLESQNKVIKSRKPYLHESRHLHAIRRPRGCGGRFLNAKKEDEHHEDSSHEEKSNLSAGK
SAMAASSGTS*
>G2430 (69..1907)

AACTTCAAGATACAGATAATCTCTCACTTAAAAAT^ 

CAATTCCAATGTTGGT6GGAAAGATAAGTGGATATGAAGATAATACTCGCTCTTTGGAGC 
GAGAAACATCTGAAATCACTTCTCTTCTCAGCCAATTTCCGGGGAATACTAATGTCCTTG 
TTGTTGACACCAATTTCACCACTCTACTCAACATGAAACAAATCATGJ^ 
ATCAAGTGTCTATTGAGACAGATGGAGAAAAAGCTCTTGC^ 

ATGAAATCAATATTGTGATTTGGGATTTTCATATGCCTGGAATTGATGGACTTCAAGCTC 

TCAAGAGCATTACTTCAAAGTTGGATTTACCTGTAGTGATTATGTCTGATGATAATCAAA 

CGGAATCTGTGATGAAAGCAACATTTTACGGTGCTTGTGACTATGTTGTGAAACCGGTTA 

AAGAAGAGGTAATGGCCAATATATGGCAACACATTGTACGGAAGAGGCTGATCTTTAAAC

CGGATGTTGCTCCACCGGTTCAATCAGATCCGGCTCGCTCTGACCGTTTAGACCAAGTCA 

AAGCTGATTTCAAGATCGTAGAAGATGAACCAATAATCAATGAGAC^ 

GGACCGAAGAAATTCAACCGGTTCAGTCAGATCTGGTTCAAGCCAACAAGTTCGACCAAG 

TGAATGGCTATTCCCCAATCATGAACCAAGATAACATGTTCAACAAAGCACCACCTAAAC 

CGCGAATGACGTGGAGAGAAGTTATTCAACCGGTTCAATO^ 

AGTTCGGCCAACTCAATGACTATTCCCAAATCATGAACCAAGATAGCATGTACAACAAAG 
CAGCAACCAAACCACAATTGACGTGGACCGAAGAAATTC^^ 

TTCAAGCCAACGAGTTCAGCAAAGTGAATGGATATTCCCAAAGCATGAACCAAGATAGCA

TGTTCAACAAATCAGCAACCAACCCGCGATTGACATGGAACGAATTACTTCAACCGGTTC 

AATCAGATCTGGTTCAATCCAATGAGTTTAGCCAATTC^GTGACTATTCTCAAATCATGA 

ACGAAGATAACATGTTCAACAAAGCAGCAAAGAAACCGCGGATGACATGGAGTGAAGTAT 

TTCAACCGGTTCAATCACATCTGGTTCCGACTGACGGTTTAGACCGAGACCACTTTGATT 

CCATAACCATAAACGGAGGTAACGGCATACAAAACATGGAAAAGAAACAAGGAAAAAAAC 

CACGGAAGCCGCGGATGACGTGGACCGAAGAGCTTCACCAAAAATTTCTGGAAGCCATCG 

AAATAATTGGTGGTATCGAAAAAGCTAACCCAAAGGTACTTGTCGAATGCTTGCAAGAAA 

TGAGGATAGAAGGAATTACTAGAAGCAATGTGGCAAGTCATCTTCAGAAACACCGTATCA 

ATCTTGAAGAAAACGAAATTCCTCAAC^AACACAAGGGAATGGTTGGGCCACTGCGT 

GTACACTAGCTCCCTCTCTCCAAGGTTCAGACAATGTCAACACAACAATACCATCGTAC^ 

TTATGAATGGTCCAGCCACTTTGAACCAAATCCAGCAGAATCAATATCAAAATGGTTTCT

TGACAATGAACAACAACCAGATCATAACCAATCCTCCGCCTCCTTTGCCCTATTTGGACC 

ATCATCACCAACAGCAACATCAGTCTTCTCCTCAATTTAATTACCTGATGAACAATGAAG 

AACTTCTTCAAGCCTCTGGCCTCTCTGCGACAGATCTTGAACTCACTTATCCAAGTTTAC 

(^TATGATCCAGAAGAGTATCTAATGAATGGCTACAATTATAATTAGTCATATAGCCCTT 

CTCTTTACTTAAGGCAGTCTATGTATGACAAATAATATGCGACTTCCCTTGTGAGTCACA 

ATATTGTTTCATTATTC 

>G2430 Amino Acid Sequence (domain in AA coordinates: 425-478)
MLVGKISGYEDNTRSLERETSEITSLLSQFPGNTNVLVTO 

SIETDAEKALAFLTSCKHEINIVIWDFHMPGIDGLQALKSITSKLDLPVVIMSDDNQTES

VMKATFYGACDYVVKPVKEEVMANIWQHIVRKRIiIFKPDVAPPVQSDPAR 

FKIVEDEPliNETPLITWTEEIQPVQSDLVQANKFDQVNGYSPII^QDNMFNKAPPKPRM 

TWTEVIQPVQSNLVQTKEFGQLNDYSQIIV^^ 

NEFS K^GYSQSMNQDSMFNKS ATNPRLTW^ 

NMFNKAAKKPRMTWSEVFQPVQSHLVPTDGLDRPHFDSITINGGNGIQNMEKKQGKKPRK
PRMTWTEELHQKFLEAIEIIGGIEKANPKVL^ 

ENQ I PQQTQGNGWATAYGTIiAPSIjQGSDNVNTTI PS YLMNGPATLNQ I QQNQYQNGFIiTM 
NNNQIITNPPPPLPYLDHHHQQQHQSSPQFNYIiMNNEEIjLQASGLSATDIjE 
PQEYLINGYNYN* 
>G2517 (66..899)

TCCTCACTCTCTCTCTTTTTCTCTAACCATAAAATCTCTTTGATCTCTTTCTCTGTGTTT 
TGATAATGGAAAATGTTGGTGTTGGGATGCCGTTTTACGATTTAGGGCAAACAAGGGTTT 
ACCCACTCTTGTCTGATTTCCACGATTTATCGGCGGAGAGGTATCCGGTAGGGTTCATGG 
ATTTACTGGGTGTTCATCGTCATACACCCACCCATACGCCGTTGATGCATTTTCCGACCA 
CACCTT^ACTCGTCCTCGAGCGAAGCTGTGAATGGAGATGACGAAGAAGAAGAAGATGGAG 
AAGAACAGCAGCATAAGACAAAGAAGCGGTTTAAATTCACTAAAATGAGTAGAAAGCAGA
CGAAGAAGAAGGTGCC^AAAGTGTCATTCATCACGAGGAGTGAGGTTCTTCATCTAGATG 
ATGGTTATAAGTGGAGAAAATACGGTCAAAAACCTGTCAAAGACAGCCCTTTTCCAAGAA 
ATTATTACCGTTGC^VCAACAACTTGGTGTGACGTGAAGAAGAGAGTAGAGAGATCATTCA 
GTGATCCAAGCAGTGTAATCACCACTTACGAAGGTCAACATACTCATCCTCGTCCACTAC
TCATCATGCCCAAAGAAGGCAGCTCTCCATCCAATGGCTCAGCTTCTAGGGCCCACATTG 
GCCTCCCTACACTCCCTCCTCAGCTTTTAGATTACAACAACCAACAACAACAAGCGCCGT 
CTTCTTTTGGAACCGAGTACATTAACAGGCAAGAAAAAGGAATTAATCATGATGATGATG 
ACGATCATGTTGTGAAGAAGAGTCGAACTCGGGATCTGCTGGATGGAGCTGGTTTAGTCA 
AAGATCATGGCCTTCTTCAGGATGTTGTTCCCTCTCATATCATTAAGGAAGAGTATTAGT
TAATCGCATAATTATGTAGCTAGCTAGCTAG 

>G2517 Amino Acid Sequence (domain in AA coordinates: TBD) 
MENVGVGMPFYDLGQTRVYPLLSDFHDLSAERYPVGFI^LLGVHRHTPTHTPLMHFPTTP 
NSSSSEAVNGDDEEEETCEEQQHKTKKRFKFTKMSRKQTKKKVPKVSFIT^ 
YKWRKYGQKPVKDS PFPRNYYRCITTWOTVKKRVERS 

MPKEGSSPSNGSASRAHIGLPTLPPQLLDYNNQQQQAPSSFGTEYINRQEKGINHDDDDD 

HVVKKSRTRDLLDGAGLVKDHGLLQDVVPSHIIKEEY*

>G2521 (103..768)

ATTCTCC^CAA.TTTCATAACTTTCTTCCGCTCAACTTCAGATAAATTCGGATTCTGTAGC 
TCTTTCAATACGACTGCGGAGATCAGAGCCAATTATTTGGTTATGGCGTCTCTGATCTCA 
GATAOTGAACCGCCGACGAGTACTACITCAGATCTCGTTCGGAGAAAGAAGAGATCCTCT 
GCTTCATCCGCCGCATCGTCTCGTTCAAGCGCATCTTCCGTCTCCGGTGAGATTCACGCG 
CGATGGCGATCGGAGAAGCAACAACGGATCTACTCAGCCAAACTGTTCCAAGCGCTCCAA 
CAAGTCCGCCTGAACTCTTCCGCCTCAACATCATCATCTCCAACGGCTCAGAAACGAGGA
AAGGCCGTCCGTGAAGCCGCCGATCGAGCTCTTGCCGTTTCCGCTCGGGGAAGAACACTC 
TGGAGCAGAGCGATCTTAGCTAATCGGATCAAACTGAAATTTCGTAAACAGAGACGTCCT 
CGAGCTACGATGGCGATTCCGGCCATGACTACGGTGGTTAGTAGCAGCAGCAACAGATCG 
AGAAAACGGAGAGTGTCGGTGTTGAGATTGAATAAGAAGAGTATACCGGATGTTAACCGG 
AAAGTACGTGTTCTAGGCCGGTTAGTTCCCGGTTGCGGTAAACAATCCGTACCGGTGATT 
CTAGAAGAAGCAACTGATTATATTCAGGCTCTGGAGATGCAAGTGAGAGCCATGAACTCT 
TTAGTTCAGCTTCTCTCCTCCTACGGCTCAGCTCCTCCACCGATTTGATGAGGTTAAAAT 
CGTCTTTTTAATTCTACCATCTCTCGATCT^ 

GTTTGATTATAATCTGTAACTACTCTTCCCAACCGCTGATTCTTCTCTGCTACAAGTAAA 

AGTAAATTTTGAACCGAGTCTTCCCATTTTTACGATCCTCAAGTCTAAATTAAG 

ATTGATTAATAAAGTCTTTACCATTAGGGTTC 

>G2521 Amino Acid Sequence (domain in AA coordinates: 145-213) 
MASLISDIEPPTSTTSDLVRRKKRSSASSAASSRSSASSVSGEIHARWRSEKQQRIYSAK 
LFQALQQVRLNSSASTSSSPTAQKRGKAVREAADRALAVSARGRTLWSRAILANRIKLKF
RKQRRPRATMAIPAMTTVVSSSSNRSRKRRVSVLRLNKKSIPDVNRKVRVLGRLVPGCGK
QSVPVILEEATDYIQALEMQVRAMNSLVQLLSSYGSAPPPI*
>G258 (60..983)

AGTGACCACCCTGCTGGTTAATCAACACCAAGAGACCTTGTAATATATAAGTTAGGAAGA 
TGAGAGAGAAGTGGGAAATGAAAAGAGATGAAATGGGACATCGATGTTGTGGAAAACACA 
AAGTGAAGAGAGGTCTTTGGTCTCCAGAGGAAGACGAGAAGCTTCTTCGTTATATCACCA 
CTCATGGTCATCCTAGTTGGAGTTCCGTTCCAAAGCTTGCCGGGTTGCAGAGATGTGGGA 
AGAGTTGCAGATTAAGGTGGATAAACTATCTAAGGCCTGATCTGAGGAGAGGTTCGTTTA 
ATGAGGAAGAAGAGCAGATTATCATCGACGTACATCGTATTCTTGGTAACAAATGGGCTC 
AGATTGCTAAGGACTTACCTGGACGCACTGATAATGAAGTC^^ 

GCATTAAGAAGAAACTTCTTTCTCAAGGCTTAGATCCTTCTACACATAATCTTATGCCTT 

CAGACAAAAGATCTTCTTCTTCAAACAATAATAATATCCCCAAGCCAAACAAAACGACGT 

CCATCATGAAGAACCCTACTGATCTTGATCAATCAACCACTGCTTTTTCAATCACAAACA 

TCAATCCACCC7^CTTCCACTAAACCAAA.CAAACTTAAATCTCCTAACQ 

CATCTCAAACCGTGATCCCTATCAATGATAACATGTCAAGTACTCAAACCATGATCCCTA 

TC?UVTGATCCCATGTeAAGTCTTTTAGATGATGAGAATATGATTCCTCACT 

TTGATGGAATGGCGATCCACGAAGCTCCGATGTTGCCTAGTGATAAGGCAGTAGTGGGAG 

TGGATGATGATGATCTCAACATGGAC^TTTTGTTTAACACTCCTTCTTCTTCTGCTTTTG 

ATCCTGATTTTGCTTCCATTTTCTCCTCTGCAATGTCTATCGATTTCAATCCCATGGATG 

ATCTTGGCAGCTGGACCTTTTAGCTTTTACTCTACAGC 

>G258 Amino Acid Sequence (domain in AA coordinates: 24-124) 
MREKWEMKRDEMGHRCCGKHKVKRGLWSPEEDEKLLRY 

KSCRLRWINYLRPDLRRGSFNEEEEQIIIDVHRILGNKWAQIAKHLPGRTDNEVKNPWNS
CIKKKLLSQGLDPSTHNLMPSHKRSSSSNNNNIPKPNKTTSIMKNPTDLDQSTTAFSITN

INPPTSTKPNKLKSPNQTTIPSQTVIPINDNMSSTQTMIPINDPMSSI^DDENMIPHWS 

VDGI^HEAPMLPSDKAWGVDDDDLK^ 

DI*GSWTF* 

>G280 (108..722)

AAGTTAATATGAGAATAATGAGAAAACCACTTTCCCAAATTGCTTTTTAAAATCCCTCCT 
CACACAGATTCCTTCCTTCATCACCTCACACACTCTCTACGCTTGACATGGCCTTCGATC 
TCCACCATGGCTCAGCTTCAGATACGC^TT(^ 

CTTATCCTCAGATGATAATGGAAGCGATTGAGTCCTTGAACGATAAGAACGGCTGCAACA 
AAACGACGATTGCTAAGCACATCGAGTCGACTCAACAAACTCTACCGCCGTCACACATGA 
CGCTGCTCAGCTACCATCTCAACCAGATGAAGAAAACCGGTCAGCTAATCATGGTGAAGA 
ACAATTATATGAAACCAGATCCAGATGCTCCTCCTAAGCGTGGTCGTGGCCGTCCTCCGA 
AGCAGAAGACTCAGGCCGAATCTGACGCCGCTGCTGCTGCTGTTGTTGCTGCCACCGTCG 
TCTCTACAGATCCGCCTAGATCTCGTGGCCGTCCACCGAAGCCGAAAGATCCATCGGAGC 
CTCCCCAGGAGAAGGTCATTACCGGATCTGGAAGGCCACGAGGACGACCACCGAAGAGAC 
CGAGAACAGATTCGGAGACGGTTGCTGCGCCGGAACCGGCAGCTCAGGCGACAGGTGAGC 
GTAGGGGACGTGGGAGACCTCCGAAGGTGAAGCCGACGGTGGTTGCTCCGGTTGGGTGCT 
GAATTAATCGGTACTTATGCAATTTCGGAATCTTTAGTTACTGAAAAATGGAATCTCTTA 
GAGAGTAAGAGAGTGCTTTAATTTAGCTTAATTAGATTTATTTGGATTTCTTTCAGTATT 
TGGATTGTAAACTTTAGAATTTGTGTGTGTGTTGTTGCTTAGTCCTGAGATAAGATATAA 
CATTAGCGACTGTGTATTATTATTATTACTGCATTGTGTTATGTGAAACTTTGTTCTCTT 
GTTGAAAAAAAAAAAAAAAAAAA 

>G280 Amino Acid Sequence (domain in AA coordinates: 97-104, 130-137, 155-162, 185-192)

MAFDLHHGSASDTHSSELPSFSLPPYPQMIMEAIESLNDKNGCNKTTIAKHIESTQQTLP 
PSHMTLLSYHLNQMKKTGQLIMVKIINYMK^ 

AATVVSTDPPRSRGRPPKPKDPSEPPQEKVITGSGRPRGRPPKRPRTDSETVAAPEPAAQ
ATGERRGRGRPPKVKPTVVAPVGC*
>G3 (16..477)

GTTTGTCTTTTATCAATGGAAAGAGAACAAGAAGAGTCTACGATGAGAAAGAGAAGGCAG 
CCACCTC^U^GAAGAAGTGCCTAACCACGTGGCTACAAGGAAGCCGTACAGAGGGATACGG 
AGGAGGAAGTGGGGCAAGTGGGTGGCTGAGATTCGTGAGCCTAACAAACGCTCACGGCTT 
TGGCTTGGCTCTTACACAACCGATATCGCCGCCGCTAGAGCCTACGACGTGGCCGTCTTC 
TACCTCCGTGGCCCCTCCGCACGTCTCAACTTCCCTGATCTTCTCTTGCAAGAAGAGGAC 
CATCTCTCAGCCGCCACCACCGCTGACATGCCCGCAGCTCTTATAAGGGAAAAAGCGGCG 
GAGGTCGGCGCCAGAGTCGACGCTCTTCTAGCTTCTGCCGCTCCTTCGATGGCTCACTCC 
ACTCCGCCGGTAATAAAACCCGACTTGAATCAAATACCCGAATCCGGAGATATATAGTCA 
ATTTATATACATGTAGTTTGTTTTGTTTGATTAGAAGATTACATTTACATACAAGATAC^ 
CATAGATACTGGAAAATATAGGTATGTATACATTCATAAATTATCTTATGTATCAAAGAA 
TTTTATAGATTCTGATTAGCTTTTTGTTTTTGTTTTTGATAAGAACTCTGATTAGTTGTC 
CGGAGACAAAACCGGCTAAGAGCAATCCATGAGAAGCTAGCGAGTGTTTTTTAGTTCAAG 
TTGTAATATAAATGCATATTAATTCTTTAGTAATTTTGT 

>G3 Amino Acid Sequence (domain in AA coordinates: 28-95) 
MEREQEESTMRKRRQPPQEEVPNHVATRKPYRGIRRR 

TTDIAAARAYDVAVFYLRGPSARLNFPDLLLQEEDHLSAATTADMPAALIREKAAEVGAR
VDALLASAAPSMAHSTPPVIKPDLNQIPESGDI*
>G343 (1..795) 

ATGGACGTCTATGG€TTATCTTCACCAGACTTACTTCGAATCGACGACCTTCTTGATTTC 
TCCAACGAAGACATCTTCTCCGCTTCnTCTTCCGGTGGTTCCACCGCCGCTACTTCCTCT 
TCTTCTTTCCCTCCTCCTCAAAACCCTAGTTTCCACCACCACCATCTCCCTTCCTCCGCC
GATCATCACTCCTTCCTCCACGACATTTGCGTTCCCAGTGATGACGCAGCTCATCTTGAA 
TGGCTTTCGCAATTCGTGGACGATTCTTTCGCTGATTTTCCGGCGAATCCATTAGGAGGA 
ACTATGACTTCTGTCAAAACTGAAACTTCCTTTCCGGGGAAACCAAGAAGCAAACGATCA 
AGAGCTCCTGCTCCTTTCGCCGGAACATGGTCTCCGATGCCACTGGAATCCGAGCATCAG 
CAGCTTCACTCCGCCGCCAAATTCAAGCCAAAGAAAGAACAATCCGGCGGAGGAGGAGGA 
GGAGGAGGAAGACATCAGTCATCGTCATCGGAGACTACGGAAGGAGGAGGAATGAGGAGA 
TGTACTCACTGTGCATCGGAGAAAACGCCACAGTGGAGGACAGGAGCACTTGGACCTAAA 
ACACTATGTAACGCTTGTGGAGTCCGGTTTAAATCCGGTAGACTTGTACCGGAATATAGA
CCGGCTTCGAGTCCTACTTTTGTTTTGACTCAGCATTCAAACTCTCACCGGAAAGTGATG 
GAGCTTCGACGGCAGAAAGAAGTTATGAGACAACCACAACAAGTTCAACTTCATCACCAC
CACCACCCGTTTTAG 

>G343 Amino Acid Sequence (domain in AA coordinates: 178-214) 

^IDVYGLSSPDLLRIDDIlIJDFSNEDIFSASSSGGSTAATSSSSFPPPQNPSB , HHHHIiPSSA 

DHHSFIJrDICWSDDAAHLEWLSQFVDDSFADFPANPLGGTMTSVKT^ 

RAPAPFAGTWSPMPLESEHQQLHSAAKFKPKKEQSGGGGGGGGRHQSSSSETTO 

CTHCASEKTPQWRTGPLGPKTLCNACGVRFKSGRLVPEYRPASSPTFVLTQHSNSHRKVM

ELRRQKEVMRQPQQVQLHHHHHPF*

>G363 (1..780) 

ATGAGACCAATATTAGACCTCGAAATTGAAGCTTCATCGGGCAGTAGTAGCAGCCAAGTG 

GCCTCAAACTTGTCTCCGGTTGGGGAAGATTACAAACCAATCTCGCTGAATCTTAGCCTC 

AGTTTCAACAACAACAACAACAATAATCTGGATCTTGAATCATCGTCTTTGACGCTGCCA 

CTTTCGAGCACGAGTGAGAGTAGTAACCCGGAGCAGCAGCAGCAACAACAACCATCTGTA 

TCAAAGAGAGTCTTCTCTTGTAACTACTGCCAAAGGAAGTTCTATAGCTCTCAAGCGCTA 

GGTGGTCACCAAAACGCTCACAAACGTGAGAGAACACTCGCCAAACGCGCTATGCTATGG 

GTCTTGCTGGGGTCTTCCCCGGTAGAGGATCAAGTAGCAATTATGCGGCTGCTGCCACAG 

CAGCCGCTCTCGTGTTTGCCGCTTCACGGAAGCGGAAACGGGAACATGACATCGTTCAGG 

ACTTTGGGAATCCGGGC^CATTCCTCGGCGCACGACGTCAGCATGACAAGGCAGACACCA 

GAAACACTTATTAGAAACATTGCCAGGTTCAACCAGGGGTATTTCGGTAATTGTATACCT 

TTTTACGTGGAGGACGACGAGGCCGAGATGCTCTGGCCGGGGAGTTTCCGGCAAGCTACG 

AATGCGGTTGCGGTTGAAGCGGGTAATGATAATTTAGGTGAAAGAAAAATGGATTTCTTG 

GACGTCAAGCAAGCGATGGATATGGAAAGTTCTCTTCCAGATCTAACCTTGAAGCTTTG

>G363 Amino Acid Sequence (domain in AA coordinates: 87-108) 

I^PILDLEIEASSGSSSSQVASI^SPVGEDYKPISIj^ 

LSSTSESSNPEQQQQQQPSVSKRVFSCNYCQRKFYSSQALGGHQNAHKRERTLAKRAMLW
VLLGSSPVEDQVAIMRLLPQQPLSCLPLHGSGNGNMTSFRTLGIRAHSSAHDVSMTRQTP
ETLIRNIARFNQGYFGNCIPFYVEDDEAEMLWPGSFRQATNAVAVEAGNDNLGERKMDFL
DVKQAMDMESSLPDLTLKL*
>G370 (1..774)

ATGGACGAAACCAACGGACGAAGAGAAACTCACGATTTCATGAACGTCAACGTTGAATCC 

TTCTCTCAGCTTCCTTTCATCCGCCGTACTCCTCCC?VAAGAAAAAGCCGCCATTATTCGT 

CTCTTCGGCCAAGAGCTCGTCGGTGATAACTCCGACAACTTATCCGCAGAACCTTCTGAT 

CATCAAACCACTACCAAGAACGATGAGAGCTCTGAGAATATCAAGGACAAAGACAAAGAA 

AAAGATAAGGACAAAGACAAAGATAACAACAACAACAGGAGATTCGAGTGTCACTACTGC 

TTCAGAAACTTCCCAACTTCTCAAGCCCTAGGTGGAGATCAAAACGCTCACAAACGTGA^ 

CGTCAA.CACGCCAAACGCGGTTCCATGAC^TCATACCTTCATCATCATCAGCCTCATGAC 

CCTCACCACATCTACGGCTTCCTCAACAACCA.CCACC^CCGTCACTATCCGTCTTGGACG 

ACGGAAGCTAGATCATACTACGGCGGAGGGGGACATCAAACGCCGTCGTACTACTCAAGG 

AATACTCTTGCTCCTCCTTCTTCTAACCCACCGACAATCAACGGAAGTCCTTTAGGTTTG 

TGGCGTGTACCGCCTTCCACGTCAACAAATACTATTG^GGCGTTTACT<^TCTT(^lCCA 

GCTTCAGCGTTTAGGTCGCATGAGCAAGAGACTAATAAGGAGCCTAATAACTGGCCGTAC 

AGATTGATGAAACCCAATGTGCAAGATCATGTGAGTCTCGATCTTCATCTCTGA 

>G370 Amino Acid Sequence (domain in AA coordinates: 97-117)

MDETNGRRETHDFMNVNVESFSQ 

HQTTTKNDESSENIKDKDKEKDKDKDKDNNNNRRFECHYCFRNFPTSQALGGHQN 

RQHAKRGSMTSYLHHHQPHDPHHIYGFIJSnra 

OTI^PSSNPPTINGSPLGLWRVPPSTSTNTIQGW 

RLMKPNVQDHVSLDIiHL* 

>G385 (37..2202)

TAGGGTTTGCTTTCAGTTTCCGGAGTATAAGAAAAGATGTTCGAGCCAAATATGCTGCTT 
GCGGCTATGAACAACGCAGAGAGCAATAACCAGAA 

GAAGGATTTCTTCGGGACGATGAATTCGACAGTCCGAATACTAAATCGGGAAGTGAGAAT 
CAAGAAGGAGGATCAGGAAACGACCAAGATCCTCTTCATCCTAACAAGAAGAAACGATAT 
CATCGACACACCCAACTTCAGATCCAGGAGATGGAAGCGTTCTTCAAAGAGTGTCCTCAC 
CCAGATGACAAGCAAAGGAAACAGCTAAGCCGTGAATTGAATTTGGAACCTCTTCAGGTC 
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AAATTCTGGTTCCAAAACAAACGTACCCAAATGAAGAATCATCACGAGCGGCATGAGAAC 

TCACATCTTCGGGCGGAGAACGAAAAGCTTCGAAACGACAACCTAAGATATCGAGAGGCT 

CTTGCAAATGCTTCGTGTCCTAATTGTGGTGGTCCAACAGCTATCGGAGAAATGTCATTC 

GACGAACACCAACTCCGTCTCGAAAATGCTCGATTAAGGGAAGAGATCGACCGTATATCC 

GCAATCGCAGCTAAATACGTAGGCAAGCCAGTCTCAAACTATCCACTTATGTCTCCTCCT 

CCTCTTCCTCCACGTCCACTAGAACTCGCCATGGGAAATATTGGAGGAGAAGCTTATGGA 

AACAATCCAAACGATCTCCTTAAGTCCATCACTC 

ATCATCGACTTATCCGTGGCTGCAATGGAAGAGCTC^^ 

CCTCTGTGGAAGAGTTTGGCTTTAGACGAAGAAGAATATGCAAGGACCTTTCCTAGAGGG 

ATCGGACCTAGACCGGCTGGATATAGATCAGAAGCTTCGCGAGAAAGCGCGGTTGTGATC 

ATGAATCATGTTAACATCGTTGAGATTCTCATGGATGTGAATCAATGGTCGACGATTTTC 

GCGGGGATGGTTTCTAGAGCAATGACATTAGCGGTTTTATCGACAGGAGTTGCAGGAAAC 

TATAATGGAGCTCTTGAAGTGATGAGCGCAGAGTTTC^ 

ACACGTGAAACCTATTTCGCACGTTAOTGTAAACAACAAGGAGATGGT^ 

GTCGATATTTCGTTGGATAGTCTCCAACCAAATCCCCCGGCTAGATGCAGGCGGCGAGCT 

TCAGGATGTTTGATTCAAGAATTGCCAAATGGATATTCTAAGGTGACTTGGGTGGAGCA 

GTGGAAGTTGATGACAGAGGAGTTCATAACTTATACAAA 

GCCTTCGGTGCTAAACGCTGGGTAGCCATTCTTGACCGCCAATGCGAGCGGTTAGCTAGT 

GTCATGGCTACAAACATTTCCn?CTGGAGAAGTTGGCGTGATAACCAACCAAGAAGGGAGG 

AGGAGTATGCTGAAATTGGCAGAGCGGATGGTTATAAGCTTTTGTGCAGGAGTGAGTGCT 

TCAACCGCTCACACGTGGACTACATTGTCCGGTACAGGAGCTGAAGATGTTAGAGTGATG

ACTAGGAAGAGTGTGGATGATCCAGGAAGGTCTCCTGGTATTGTTCTTAGTGCAGCCACT 

TCTTTTTGGATCCCroTTCCTCCAAAGCGAGTCTTTGACTTCCTCAGAGACGAGAATTCA 

AGAAATGAGTGGGATATTCTGTCTAATGGAGGAGTTGTGCAAGAAATGGCACATATTGCT 

AACGGGAGGGATACCGGAAACTGTGTTTCTCTTCTTCGGGTAAATAGTGCAAACTCTAGC 

GAGAGCAATATGCTGATCCTACAAGAGAGCTGCATTGATCCTACAGCTTCCTTTGTGATC 

TATGCTCCAGTCGATATTGTAGCTATGAACATAGTGCTTAATGGAGGTGATCCAGACTAT 

GTGGCTCTGCTTCCATCAGGTTTTGCTATTCTTCCTGATGGTAATGCCAATAGTGGAGCC 

CCTGGAGGAGATGGAGGGTCGCTCTTGACTGTTGCTTTTCAGATTCTGGTTGACTCAGTT 

CCTACGGCTAAGCTGTCTCTTGGCTCTGTTGCAACTGTCAATAATCTAATAGCTTGCACT 

GTTGAGAGAATCAAAGCTTCAATGTCTTGTGAGACTGCTTGAAAACCATCCATTAGC 

>G385 Amino Acid Sequence (domain in AA coordinates: 60-123) 

lYTFEPNI^LAAMNNM 

HPlilKIKKRYHRHTQLQIQEMEAFFKECP 

MEIHERHENSHLRAElTEKIjRiroNLRYREALANAS C PNCGGPTAIGEMS FDEHQLRLENARL 

REEIDRISAIAAKYVGKPVSNYPLMSPPPLPPRPLELAMGNIGGEAYGlsn^ 

PTESDKPVIIDLSVAAMEELMRMVQVDEPLWKSLALDEEEYARTF 

SRESAVVIMNHVNIVEILMDVNQWSTIFAGMVSRAMTLAVLSTGVAGNYNGALQVMSAEF
QVPSPLVPTRETYFARYCKQQGDGSWAVVDISLDSLQPNPPARCRRRASGCLIQELPNGY
SKVTWVEHVEVDDRGVHNLYKHMVSTGHAFGAKRWVAILDRQCERLASVMATNISSGEVG
VITNQEGRRSMLKLAERMVISFGAGVSASTAHTWTTLSGTGAEDVRVMTRKSVDDPGRSP
GIVLSAATSFWIPVPPKRVFDFLRDENSRNEWDILSNGGVVQEMAHIANGRDTGNCVSLL
RVNSANSSQSNMLILQESCIDPTASFVIYAPVDIVAMNIVLNGGDPDYVALLPSGFAILP
DGNANSGAPGGDGGSLLTVAFQILVDSVPTAKLSIiGSVAT^^ 
A* 

>G439 (128..967)

TATAAATCTTCGTTTCTACTTTTTTTTCTT 
AGGGCTTCTTCTCFTTGTTTCTCCAATCTTTAT^ 

TATACAAATGGCAATGGCTTTAAACATGAATGCTTACGTAGACGAGTTCATGGA 
TGAACCATTCATGAAGGTAACTTCATCTTCTT^ 

ATTAACTCCTAATTTCATCCCTAATAATGACCAAGTCTTACCGGTATCTAACCAAACCGG 

TCCGATTGGGCTAAACCAGCTGACTCCAACAGAAATCCTCCAAATTCAGACAGAGTT^ 

TCTCCGGCAAAACCAATCTCGTCGTCGCGCTGGTAGTCATCTTCTCACCGCTAAACCAAC

OTCAATGAAGAAAATCGACGTAGCAACTAAACCGGTTAAACTATACCGAGGCGTAAGACA 

GAGGCAATGGGGTAAATGGGTAGCTGAGATTCGGCTACCTAAAAACCGAACCCGGTTATG 

GCTCGGTACGTTCGAAACGGCTCAAGAAGCTGCATTAGCTTACGATC^^GCAGCTCATAA 

GATC^GAGGAGAC^ACGCTCGTCTCAATTTCCCAGAC^TTGTTCGTC^GGACACrATAA 
ACAGATATTGTCTCCGTCTATCAACGCAAAGATCGAATCCATCTGCAATAGTTCTGATCT

TCCACTGCCTCAGATCGAGAAACAGAACAAAACAGAGGAGGTGCTCTCTGGTTTTTCCAA 

ACCGGAGAAAGAACCGGAATTTGGGGAGATATACGGATGCGGATACTCGGGCTCATCTCC 

TGAGTCGGATATAACGTTGTTGGATTTCTCAAGCGACTGTGTGAAAGAAGATGAGAGTTT 

CTTGATGGGTTTGC^CAAGTATCCTTCTTTGGAGATTGATTGGGACGCTATAG 

CTTCTGAATCCATTTTATCTTTTTGATT 

AGAGCTTTGTAAGGGAAGTTCTTGAATGAGAGTTGCAGAGGACTAGTGGAACCTAACTCT 

GTTTTCTTTTGTAAGTATTGTTTATAATGGGCCGTTGAATGGGCOT 

GCCCAAGTTTTTAAAAAAAAAAAAAAAAAAAAAAAAA 

>G439 Amino Acid Sequence (domain in AA coordinates: 110-177) 
MAMALNMNAYVDEFMEAI^ 

GLNQIiTPTQILQIQTELHIiRQNQSRRRAGSHLLTAKPTSMKKIDVATKPVKLY 
WGKWVAE IRLPKNRTRLWLGTFETAQEAALAYDQAAHKIRG ivrqghykqi 
LSPSINAKIESICNSSDLPLPQIEKQNKTEEVLSGFSKPEKEPEFGEIYGCGYSGSSPES 
DITLLDFSSDC^/KEDESFLMGLHKYPSLEIDWDAIEKLF* 
>G440 (237..1301)

AAAAAATCACTGTTTCATAACACGTTTTTCTCTCTCACCCACCAAAAAAAAATCTTTTGT 

TCTTGTTACCAAAAAATCTCGTGATAAATCTCTTCAAACTTTGTTTTATTTTCTTCTTG 

TTCTCTCGAAATCTCTCTCAACAAACCCAGAAACTTTCCTTGATTCGCAAGCTTTTCT 

CTTTTATATTCTTCATTTTGATGCGAATATAGAGAGAGTCCATAAAAGAAACAGTAATGG 

ACGAATATATTGATTTCCGACCATTGAAGTAGACAGAGCAG^GACTTCAATGACTAAAT 

ACACCAAAAAGTCATCGGAA7UVACTTTCCGGTGGTAAGTCATTGAAAAAGGTTAGTATTT 

GTTATACTGATCCTGACGCAACAGATTCATCAAGTGACGAAGACGAAGAAGATTTCTTGT 

TTCCTCGCCGGAGAGTCAAAAGATTCGTTAACGAGATCACTGTTGAGCCTAGCTGTAACA 

ACGTCGTCACCGGAGTTTCGATGAAAGATAGAAAGAGACTCTCTTCTTCCTCCGATGAAA 

CTCAATCTCCGGCGTCGAGTCGTCAACGTCCTAATAACAAAGTTTCAGTCTCCGGTCAGA 

TAAAGAAGTTCCGTGGTGTTAGACAACGGCCATGGGGGAAATGGGCGGCGGAGATTAGAG 

ATCCGGAGCAACGTCGGAGGATTTGGCTCGGGACTTTTGAGACGGCGGAGGAAGCTGCCG 

TGGTTTATGATAACGCCGCTATAAGACTCCGTGGACCGGACGCTTTAACTAATTTCTCCA 

TACCGCCTCAAGAAGAGGAAGAAGAAGAAGAACCGGAACCGGTTATTGAGGAGAAACCGG 

TTATTATGACGACGCCAACACCAACAACATCGAGTTCTGAATCAACTGAAGAAGATTTAC 

AACATCTCTCATCTCCTACTTCGGTTCTCAATCACCGGTCAGAAGAGATTCAACAAGTAC 

AACAACCGTTTAAATCAGCTAAACCCGAACCGGGGGTTTCAAATGCACCATGGTGGCATA 

CCGGGTTTAATACCGGTTTAGGTGAATCAGACGATTCATTTCCTTTGGATACTCCGTTTC 

TTGACAACTATTTCAATGAATCACCACCAGAGATGTCAATATTTGACCAACCAATGGATC 

AAATTTTCTGTGAAAATGATGATATCTTCAATGATATGTTGTTCTTGGGTGGTGAAACTA

TGAACATTGAAGATGAGTTAACAAGTTCTAGTATCAAAGATATGGGTTCAACGTTTAGTG 

ATTTTGATGATTCATTGATATGAGATCTATTAGTTGCTTAATATGATGATGAGAGTGAAG 

AAGAAACCATCAAGCAAATATCTATGGTGTGACTGAAAAAT^ 

CTTTCATAAGTTCATGAGCTTTTTTGTTTCTTTTTTTTAATAATTTATTTAGTTTTGTCA 
GGAGCTTGTAAAACAGTTTTGGAGAAATAGTGGAAAAATAGTTTAATTAAAAAAAAAAAA 
AAAAAAA 

>G440 Amino Acid Sequence (domain in AA coordinates: 122-189) 

MDEYIDFRPLKY'^HKTSMTKYTKKSSEKLSGGKSLKKVSICYTDPDATDSSSDEDEEDF 

LFPRRRVKRFWEITVEPSCNNVVTGVSMKD^ 

Q I KKFRGVRQRPWGKWAAE PEQRRRI WLGTFETAEEAAVVYDNAAIRLRGPDALTNF 

SIPPQEEEEEEEPEPVIEEKPVIMTTPTPTTSSSESTEEDLQHLSSPTSVLNHRSEEIQQ 

VQQPFKSAKPEPGVSNAPWWHTGFNTGLGESDDSFPI^TPFIjDI^FI^SPPEMSIFDQPM 

DQIFCEBTODIFNDMLFLGGETI^IEDE^ 

>G5 (417..1421)

TTTTTTTTTTGCAATCTCCCCCTAATCTGTTGTTTCTCGCTTCTTCTTCTGTTAATCATC
TGTCTTTCAAAAAGAAAGAAAAAAGAAAAATTCGATT^ 

GAAAAAAATCAAGCTTATGAATTTGTGTTTAATTTTTTGTTTTAATTTGAAAG 
TTTTCAGAACGAGATCGTTTTTTC^AATTTCTTCTGATTTTACCrr 

GATTTTAGTGAATCGAGGGTGAAATTTTTGATTCCCTCTTTTCGGATCTACACAGAGGTT 

GCTTATTTCAAACCTTTTAGATCCATTTTT^ 

TTTACTTTTTTATAAGTCTCAGGTTCA^ 
CAGCTGCTATGAATTTGTACACTTGTAGCAGATCGTTTCAAGACTCTGGTGGTGAACTCA
TGGACGCGCTTGTACCTTTTATCAAAAGCGTTTCCGATTCTCCTTCTTCTTCTTCTGCAG 
CGTCTGCGTCTGCGTTTCTTCACCCCTCTGCGTTTTCTCTCCCTCCTCTCCCCGGTTATT 
ACCCGGATTCAACGTTCTTGACCCAACCGTTTTC^^ 

GGTCATTAATCGGACTCAACAACCTCTCTTCTTCTCAGATCCACCAGATCCAGTCTCAGA 

TCCATCATCCTCTTCCTCCGACGCATCACAACAACAACAACTCTTTCTCGAAT^ 

GCCCAAAGCCGTTACTGATGAAGCAATCTGGAGTCGCTGGATCTTGTTTCGCTTACGGTT 

CAGGTGTTCCTTCGAAGCCGACGAAGCTTTACAGAGGTGTGAGGCAACGTCACTGGGGAA 

AATGGGTGGCTGAGATCCGTTTGCCGAGAAATCGGACTCGTCTCTGGCTTGGGACTTTTG 

ACACGGCGGAGGAAGCTGCGTTGGCCTATGATAAGGCGGCGTACAAGCTGCGCGGCGATT 

TCGCCCGGCTTAACTTCCCTAACCTACGTCATAACGGATTTCACATCGGAGGCGATTTCG

GTGAATATAAACCTCTTCACTCCTCAGTCGACGCTAAGCTTGAAGCTATTTGTAAAAGCA 

TGGCGGAGACTCAGAAACAGGACAAATCGACGAAATCATCGAAGAAACGTGAGAAGAAGG

TTTCGTCGCCAGATCTATCGGAGAAAGTGAAGGCGGAGGAGAATTCGGTTTCGATCGGTG

GATCTCCACCGGTGACGGAGTTTGAAGAGTCCACCGCTGGATCTTCGCCGTTGTCGGACT 

TGACGTTCGCTGACCCGGAGGAGCCGCCGCAGTGGAACGAGACGTTCTCGTTGGAGAAGT 

ATCCGTCGTACGAGATCGATTGGGATTCGATTCTAGCTTAGGGGCAAAATAGGAAATTCA 

GCCGCTTGC^^TGGAGTTTTTGTGAAATTGC^TGACTGGCCCAAGAGTAATTAATTAAAT 

ATGGATTAGTGTTAAATTTCGTATGTTAATATTTGTATTATGGTTTGTATTAGTCTCTCT 

GTGTCGGTCCAGCTTGCGGTTTTTTGTCAGGCTCGACCATGCCA 

TAATCTTTTTTTCTTTTGTCTTATGTAATTTGTAGCTTCAGTTTCTTCATCTATAATGCA 
ATTTTATTATGATTATGTG 

>G5 Amino Acid Sequence (domain in AA coordinates: 149-216) 
MAAAMNLYTCSRSFQDSGGELMDALVPFIKSVSDSPSSSSAASASAFLHPSAFSLPPLPG
YYPDSTFIiTQPFSYGSDIiQQTGSLIGLNNLSSSQIHQIQSQIHHPLPPTHHNN^^ 
LSPKPLLMKQSGVAGSCFAYGSGVPS KPTKLYRGVRQRHWGKVfVAE 11^ 
FDTAEEAAIAYDKAAYKLRGDFARLNFPNLRHNGFHIG^^ 

SMAETQKQDKSTKSSKKREKKVSSPDLSEKVKAEENSVSIGGSPPVTEFEESTAGSSPLS 
DLTFADPEEPPQWNETFSLEKYPSYEIDWDSILA*
>G550 (1..1374) 

ATGGCTGATCCGGCGATTAAGCTCTTTGGAAAGACGATTCCTTTACCTGAGCTTGGTGTT 
GTTGATTCTTCTTCTAGCTATACCGGATTTTTAACCGAAACTCAGATTCCTGTTCGGTTA 
TCAGATTCGTGTACCGGCGATGATGATGATGAAGAGATGGGTGATTCCGGTTTAGGACGA 
GAAGAAGGTGATGATGTTGGTGATGGTGGAGGAGAGAGCGAGACTGATAAAAAGGAAGAA 
AAAGATAGTGAGTGTCAGGAAGAGTCATTGAGGAATGAATCTAATGATGTTACTACTACT 
ACATCGGGTATAACTGAAAAAACGGAAACAACAAAAGCTGCAAAGACGAATGAAGAGTCA 
GGTGGTACTGCTTGCTCTCAAGAGGGGAAGTTAAAGAAACCTGATAAGATTCTACCGTGT 
CCGCGATGTAACAGCATGGAAACCAAGTTCTGTTACTACAACAACTATAATGTTAACCAA 
CCTCGCCATTTCTGCAAGAAATGTCAGAGATATTGGACAGCTGGTGGAACGATGAGGAAT 
GTTCCGGTTGGTGCTGGGAGACGTAAGAATAAGAGTCCAGCTTCTCATTATAACCGTCAT 
GTAAGTATAACATCTGCGGAAGCTATGCAGAAGGTGGCGAGAACTGATCTTCAACATCCT 
AATGGTGCAAATCTTCTCACTTTTGGCTCTGATTCTGTGCTTTGTGAATCTATGGCTTCT 
GGATTGAATCTTGTTGAGAAGTCATTGTTGAAGACACAAACTGTATTGCAAGAACCCAAT 
GAAGGCTTGAAGATTACGGTTCCGTTAAACCAGACAAACGAAGAAGCTGGAACAGTCAGC 
CCGTTACCAAAAGTTCGATGCTTTCCAGGACCACCACCAACTTGGCCTTACGCTTGGAAC 
GGAGTTTCGTGGACGATTTTACCGTTTTACCCTCCACCGGCTTACTGGAGCTGCCCGGGG 
GTTTCACCGGGGGCATGGAACAGCTTCACATGGATGCCACAACCCAATTCACCATCTGGT 
TCCAATCCAAATTCTCCTAC^CTAGGTAAACATTCACGTGACGAGAACGCTGCTGAACCA 
GGAACCGCTTTTGATGAAACCGAGTCACTTGGTAGGGAGAAAAGC^AACCCGAGAGATGC 
TTGTGGGTTCCCAAGACGCTGAGGATTGATGATCCAGAGGAAGCTGCTAAAAGTTCCATC 
TGGGAAACATTAGGGATCAAAAAAGACGAAAATGCGGATACTTT 

TCAACCAAAGAAAAAAGCAGTCTTTCTGAAGGAAGACTTCCGGGAAGAAGACCGGAGTTG 
CAAGCGAATCCTGCTGCTCTTTCTAGGTCAGCAAACTTCCATGAGAGCTCATAG
>G550 Amino Acid Sequence (domain in AA coordinates: 134-180) 
MADPAIKLFGKTIPLPELGVVDSSSSYTGFLTETQIPVRLSDSCTGDDDDEEMGDSGLGR
EEGDDVGDGGGESETDKKEEKDSECQEESLRNESNDVTTTTSGITEKTETTKAAKTNEES
GGTACSQEGKLKKPDKILPCPRCNSMETKFCYYNNYNVNQPRHFCKKCQRYWTAGGTMRN
VPVGAGRRKNKSPASHYNRHVSITSAEAMQKVARTDLQHPNGANLLTFGSDSVLCESMAS
GLNLVEKSLLKTQTVLQEPNEGLKITVPIiNQTNEK 

GVSWTILPFYPPPAYWSCPGVSPGAWNSFTWMPQPNSPSGSNPNSPTLGKHSRDENAAEP 
GTAFDETESLGREKSKPERCLWVPKTLRIDDPEEAAKSSIWETLGIKKDENADTFGAFRS
STKEKSSDSEGRLPGRRPELQANPAALSRSANFHESS*
>G670 (28..1152)

CACAGCATTGCAGCTGTGAATAACTAAATGGGGAGACATTCTTGCTGTTACAAACAAAAG 
CTGAGGAAAGGGCTTTGGTCTCCTGAAGAAGACGAGAAGCTTCTTACTCACATCACCAAT 
CACGGCCATGGCTGCTGGAGOTCTGTCCCTAAACTCGCTGGTTTGCAGAGATGTGGGAAG 
AGTTGTCGACTCGAGC1AGATCTGGTACCGCCX3ACTAAGATGGATCAATTACTTGAGACCT 
GATTTAAAGAGAGGAGCTTTTTCTCCTGAAGAAGAGAATCTCATCGTCGAACTTCATGCC 
GTCCTTGGAAACAGATGGTCACAGATTGCGTCAAGGCTTCCGGGTAGAACCGACAACGAG 
ATCAAGAATCTATGGAACTCAAGCATCAAGAAGAAACTGAAACAAAGAGGCATTGACCCA 
AACACACACAAGCCCATCTCTGAAGTGGAGAGTTTTAGCGACAAAGACAAACCAACAACA
AGC7^CAACAAAAGAAGCGGTAACGATCACAAGTCTCCTAGTTCCTCTTCTGCGACTAAC 
CAAGACTTCTTCCTCGAAAGGCCATCTGATTTATCCGACTACTTCGGATTTCAGAAGCTT 
AACTTCAACTCCAATCTAGGACTCTCTGTTACAACTGATTCTTCIACTCTGCTCGATGATT 
CCGCCGC^GTTTAGCCCCGGGAACATGGTTGGTTCTGTCCTTCAGACACCAGTATGCGTA 
AAGCCCTCGATTAGTCTTCCTCCCGACAACAACAGTTCGAGTCCTATCTCCGGAGGAGAT 
CATGTGAAATTGGCTGCACCAAACTGGGAATTTCAGACAAACAACAATAATACCTCAAAT 
TTCTTCGACAATGGCGGATTCTCATGGTCTATCCCAAATTCTTCTACTTCTTCTTCACAA 
GTCAAACCAAATCATAACTTCGAAGAAATAAAATGGTCAGAGTATTTGAACACACCGTTC 
TTCATAGGGAGTACTGTACAGAGTGAAACCTCTCAACCAAT 

GATTACTTAGCCAATGTTTCAAACATGACAGATCCTTGGAGCCAAAACGAGAACTTO 

ACAACTGAAACTAGTGACGTGTTCTCCAAGGATCTTCAGAGAATGGCCGTCTCTTTTGGT 

CAGTCCCTTTAGCTTTTTTCTTTCTTTCTTTCTTATTTCTAACAGATGTAGAGAACATAA 

AGATATACAAATACATACAATGTCAATACGTACAGTGGATTTAAGTGTTCTGTATATTTC 

ATGGGCGAGCTGTCTTTATTTTTATGTTTAAAAAAAAAAAAAAAAAA 

>G670 Amino Acid Sequence (domain in AA coordinates: 14-122) 

MGRHSCCYKQKLRKGIiWSPEEDEKLLTHITNHGHGCWSSVPKIiAGLQRCGKSCRLEQIOT 

RRLRWINYIiRPDIjKRGAFS PEEENL I VELHAVTjGNRWS QI ASRIiPGRTDNE I KNL WNS S I 

KKKLKQRGIDPNTHKPISEVESFSDKDKPTTSimKRSGNDHKSPSSSSATNQDFFLERPS 

DLSDYFGFQKLNFNSTSTLGLSVTTDSSLCSM 

NNSSSPISGGDHVKIiAAPNWEFQTON^^ 

IKWSEYLNTPFFIGSWQSQTSQPIYIKSETDYL^^ 

KDLQRMAVSFGQSL* 

>G760 (175..1878)

TGCTTAATTCCAATGCCATCGTGATCGATTCATCTCTCTCTCTCTCTTCCAATTTTCCCA 
ATTCTTTTTTAAAACCCTAATTTTTCAGATATCTGATTATCTCTTGTATTTCTTCTACTC 
GATTTGCTCCCATAAAAACCCTTACTTTCTTCAAGTTCTGGTTTTCACCGATTGATGGGT 
CGTGGCTCAGTGACGTCGCTTGCTCCTGGGTTCCGTTTTCACCCGACGGATGAGGAACTT 
GTTCGCTACTACCTTAAGCGTAAGGTCTGC^UVCAAACCCTTTAAGTTCGATGCTATTTCC 
GTCACCGACATATACAAGTCTGAGCCTTGGGATCTACCAGATAAGTCGAAGCTGAAAAGT 
AGAGACTTGGAATGGTACTTCTTTAGTATGCTGGATAAGAAGTACAGTAATGGTTCCAAG 
ACGAATCGTGCTACGGAGAAAGGGTATTGGAAGACGACTGGGAAAGATCGGGAGATTCGT 
AATGGTTCAAGAGTCGTTGGGATGAAGAAGACACTTGTTTATCACAAGGGTCGAGCTCCT 
CGTGGTGAAAGGACCAATTGGGTTATGCATGAGTATCGGCTTTCTGATGAGGACTTGAAG 
AAAGCTGGTGTGCGACAAGAAGCATATGTGTTATGTAGGATATTCCAGAAAAGTGGTACG 
GGTCCTAAGAATGGGGAGCAGTATGGTGCTCCTTATCTTGAGGAGGAGTGGGAAGAAGAT 
GGAATGACTTATGTACCTGCTCAAGATGCTTTCAGTGAAGGATTGGCTTTGAATGATGAT
GTTTATGTCGATATTGATGACATTGACGAGAAGCCCGAAAATCTGGTGGTCTATGATGCC 
GTTCCTATTCTACCTAACTATTGTCATGGGGAATCAAGTAACAATGTTGAATCAGGCAAT 
TACTCAGACTCTGGAAATTACATTCAACCAGGAAACAATGTTGTCGACTCTGGTGGGTAC 
TTTGAACAACCAATTGAAACTTTTGAGGAAGATCG^^ 

ATTCAGCCTTGTTCTCTGTTTCCAGAGGAACAAATTGGCTGTGGTGTGCAAGACGAAAAT 
GTGGTGAATCTGGAATCTTCCAACAATAATGTGTTTGTAGCTGATACATGCTACAGTGAC 
ATTCCTATTGATCATAACTATTTACCGGATGAGCCATTCATGGATCCTAATAACAATCTT 
CCACTCAACGATGGTCTGTACCTGGAAACGAATGATCTCAGCTGTGCTCAACAAGATGAT

TTTAACTTCGAAGATTATCTCAGCTTCTTTGATGATGAGGGTTTGACTTTTGACGATTCT 

CTATTAATGGGACCTGAAGATTTTCTTCCCAACCAAGAAGCCCTTGACCAGAAACCTGCC 

CCTAAAGAATTGGAGAAGGAGGTCGCAGGAGGCAAAGAGGCAGTGGAGGAAAAGGAAAGT 

GGCGAAGGATCTTCTTCAAAACAAGATACAGATTTCAAGGACTTTGATTC^GCTCCGAA^ 

TACCCATTTCTCAAAAAGACGAGCC^CATGCTTGGAGCCATTCCTACTCCATCTTCATTT 

GCTTCACAGTTCCAAACAAAGGACGCAATGCGTCTACACGCAGCACAATCTTCTGGTTCA 

GTTCACGTGACTGCAGGTATGATGAGAATATCAAACATGACTCTAGCAGCGGACAGCGGT 

ATGGGCTGGTCATATGACAAGAACGGTAACCTCAACGTAGTCCTTTCATTCGGGGTAGTC 

CAACAGGATGATGCGATGACTGCCTCGGGAAGCAAGACAGGAATTACGGCGACAAGAGCT 

ATGTTAGTCTTCATGTGTTTATGGGTTCTCCTACTCTCTGTTAGCTTCAAAATAGTAACC 

ATGGTGTCTGCTCGGTAATAGGATCAAAGl^GAATCGTCTCAAAGACTTTTTTTGGTGTT 

TGTACCTCTCCAATCATATAGCCTTTAACTTTGGCAGTGCTTTGCTGCTCAAT 

TTTTAAAAAAAAAAAAAAAAA 

>G760 Amino Acid Sequence (domain in AA coordinates: 12-156) 

MGRGSVTSLAPGFRFHPTDEELVRYYLKRK^ 

KSRDLEWYFFSMLDKKYSNGSKTNRATEKGY^ 

APRGERTNWVMHEYRLSDEDLKKAGVPQEAYVIjCRIFQKSGTGPKKGEQYGAPYIjEEEWE 
EDGMTYVPAQDAFSEGLAIiNDDVYVD IDD IDEKPENLWYDAVP I LPNYCHGES SNNVE S 
GNYSDSGNYIQPGNNWDSGGYFEQPIETFEEDRKPIIREGSIQPCSLFPEEQIGCGVQD 
ENVVl^ESSNlSINVWADTCYSDIPIDHNYIiPDEPFm 

DDFNFEDYLSFFDDEGLTFDDSLLMGPEDFLPNQEALDQKPAPKEIjEKEVAGGKEAVEEK 
ESGEGSSSKQDTDFKDFDSAPKYPFLKKTSHMLGAIPTPSSFASQFQTKDAMRIiHAAQSS 
GS VHVTAGMMRI SNMTLAADS GMGWS YDKNGNLNWLS FG WQQDDAMTAS G S KTG I TAT 
RAMIiVFMCLWVLLLSVSFKIVTMVSAR* 
>G831 (92..1987)

TTCTTTCATCGTTGTGTCTATTATAAATATATGTCAATTT 

ATTGATTGATTGATTTTTTTTTCTTTAAGAGATGAATTTATTTACAAGAATCTCATCT^ 

GACTAAGAAGGCCAATCTTTACTACGTAACCCTAGTTGCTCTTCTCTGCATCGCTAGCTA 

CCTTCTCGGTATTTGGCAAAACACGGCGGTTAATCCACGCGCCGCCTTCGATGATTGAG 

CGGTACACCGTGCGAGGGATTCACCAGACCTAATTCTACGAAAGATCTCGACTTCGACGC 

GCATCACAACATTCAAGATCCACCTCCGGTGACGGAAACCGCCGTTAGTTTCCCGTCGTG 

TGCCGCCGCGTTGAGCGAGCACACGCCATGCGAAGACGCGAAGCGATCGTTGAAATTCTC 

GAGGGAGAGATTGGAGTATAGGCAAAGGCATTGTCCCGAGAGAGAAGAAATCTTGAAGTG 

CAGAATTCCGGCGCCGTACGGTTACAAAACGCCGTTCCGATGGCCGGCGAGTCGTGACGT 

GGCGTGGTTCGCTAATGTGCCTCACACGGAGCTTACGGTTGAGAAAAAGAATC^GAATTG 

GGTCCGGTACGAGAATGATCGGTTTTGGTTCCCTGGTGGAGGTACGATGTTTCCACGTGG 

CGCTGATGCTTACATTGATGATATCGGACGGTTGATTGATCTCAGCGACGGCTCTATCCG 

TACAGCCATCGATACCGGTTGCGGGGTGGCTAGCTTCGGTGCATATCTTTTATCAAGAAA 

CATTACAACGATOTCATTTGCACCAAGAGACACACACGAAGCTCAAGTCCAGTTCG 

CGAGCGTGGTGTGCCGGCGATGATCGGAATCATGGCTACAATCCGCCTACCGTACCCTTC 

TAGAGCCTTTGATTTAGCACATTGCTCTCGTTGCCTTATTCCGTGGGGCCAAAACGATGG 

GGCTTACTTGATGGAGGTGGATAGGGTTTTAAGACCAGGAGGGTACTGGATACTTTCTGG 

ACCGCCGATTAATTGGCAGAAACGGTGGAAAGGGTGGGAACGGACCATGGATGATTTGAA 

TGCAGAGCAGACTCAGATCGAGCAGGTCGCGAGAAGCTTGTGTTGGAAGAAAGTTGTTCA 

AAGAGATGATCTTGCTATTTGGCAAAAACCCTTTAACCACATTGACTGTAAGAAAACCAG 

AGAGGTTTTGAAAAATCCGGAGTTTTGTCGTCATGATCAAGATCCCGACATGGCCTGGTA 

TACGAAGATGGATTGTTGTTTGACACCATTACCTGAAGTTGATGACGCTGAGGATCTAAA 

GACGGTGGCCGGAGGGAAGGTAGAAAAGTGGCCGGCTAGATTAAACGCGATTCCTCCGAG 

AGTAAACAAAGGCGCTCTCGAGGAAATCACACCTGAAGCTTTCTTGGAGAACACGAAACT 

GTGGAAACAGAGAGTTTCTTATTACAAGAAGTTAGATTACCAGTTGGGTGAAACCGGGAG 

ATACAGAAACTTAGTCGACATGAACGCTTACCTCGGTGGATTCGCGGCGGCTCTAGCGGA 

TGATCCGGTCTGGGTCATGAACGTTGTCCCGGTCGAGGCTAAGCTCAATACGCTCGGTGT 

CATCTACGAGCGTGGTCTAATCGGAACGTATCAAAACTGGTGTGAAGCCATGTCGACGTA 

TCCAAGAACGTATGATTTTATCCATGCTGACTCGGTTTTCACA 

TGAACCGGAGGAGATATTGTTGGAGATGGACCGAATTCTTAGACCGGGTGGTGGTGTGAT 
TATAAGAGATGACGTGGACGTTTTGATCAAGGTTAAGGAATTAACCAAAGGATTAGAATG 
GGAAGGTAGAATTGCTGACCACGAGAAGGGTCCTCATGAAAGAGAGAAGATTTACTATGC
GGTGAAACAGTATTGGACCGTTCCTGCGCCTGATGAAGATAAAAACAACACTAGTGCTCT 
CTCCTGATTTTTGAGTTTTTTTTTTCTTACAATGTTTTTTTTTTTTTTT 
TATACAACAATAAATTCTCAATAATTGTTGTCGCGGCCG 

>G831 Amino Acid Sequence (domain in AA coordinates: 470-591) 
MNIiFTRISSRTKKAlTLYYVTLVAL^ 

NSTKDLDFDAHHNIQDPPPVTETAVSFPSCAAAIiSEHTPCEDAKRSLKFSRERLEYRQRH 
CPEREEILKCRIPAPYGYKTPFRWPASRDVAWFAim>HTELTVEKKNQNW 
PGGGTMFPRGADAYIDD IGRL IDLSDGS IRTAIDTGCGVAS FGAYIiLSRNITTMS FAPRD 
THEAQVQFAIiERGVPAMIGIMATIRLPYPSRAFDIiAHCSRCL^^ 
RPGGYWILSGPPINWQKRWKGWERTMDDIiNAEQTO 

FNHIDCKKTREVIjKNPEFCRHDQDPDMAWYTKMDSCLTPLPEVDDABDLKTVA 

PARIJ^AIPPRVNKGALEEITPEAFLENTKLWKQRVSYYKKl^ 

LGGFAAAIiADDPVVAmNVVPVEAKljOT 

SVFTLYQGQCEPEEILIiEMDRILRPGGGVIIRDDVDVIjIKVKEIiTKGLEWEGRIADHEKG 
PHEREKI YYAVKQYWTVP APDEDKNNTS ALS * 
>G864 (503..1534)

TGCAAAAACATTTTCTTGTCTCTCCTCTGC^ 

CTAGAAAAACCCAAGCAAAiSCTTTAACCCCTTCCTCCTCCAAAAGTAGCATCTTCCTCTT 
TTTCTATTTCTCCTTTCCTCOTCTTATCTCTCTCTCGTTTGTGAACGATTCCTTAAGAAT 
ATAACCAAAAGCCCTTTTCTCCTCT 

GTTTCTCTCTCGGCTCTCGCAGTGTTTTTCGGGCCTTTTGTTCTTTCTATAAAAAAAAAA 
TTCGCGTCCTTTAAGAAAACTTTTTCCACCTAGAGAAGAAGAAGAGTATCACTCTTGTTG 
TTCAAGTTTCTCTCTTTAATAATy^AATCCATCTTTATTCTTTGTCTTCTTTCCTTTTTGC 
TTTCCCTAATCTCTATGTTATAAAC^CACAGAGAGAAACAAAGTCACAGTCTCGAGTC?^A 
AAACAGAGAATACGAAAGAAAAATGGAAGCGGAGAAGAAAATGGTTCTACCGAGAATCAA 
ATTCACAGAGCACAAAACCAACACGACAACAAT 

AACCAGGATTCTTCGTATCTCAGTCACTGACCCAGACGCTACTGATTCCTCCAGTGACGA 
CGAAGAAGAAGAACATCAACGCTTTGTCTCTAAACGCCGTCGTGTTAAGAAGTTTGTCAA 
CGAAGTCTATCTCGATTCCGGTGCTGTTGTTACTGGTAGTTGTGGTCAAATGGAGTCGAA 
GAAGAGACAAAAGAGAGCGGTTAAATCGGAGTCTACTGTTTCTCCGGTTGTTTCAGCGAC 
GACGACTACGACGGGAGAGAAGAAGTTCCGAGGAGTGAGACAGCGTCCATGGGGAAAATG 
GGCGGCGGAGATAAGAGATCCGTTGAAACGTGTACGGCTCTGGTTAGGTACTTACAACAC 
GGCGGAAGAAGCTGCTATGGTTTACGATAACGCCGCTATTCAGCTTCGTGGTCCCGACGC 
TCTGACTAATTTCTCAGTCACTCCGACAACAGCGACGGAGAAGAAAGCCCCACCACCGTC 
TCCGGTGAAGAAGAAGAAGAAGAAAAACAACAAAAGCAAAAAATCCGTTACTGCTTCTTC 
CTCCATCAGCAGAAGCAGCAGCAACGATTGTCTCTGCTCTCCGGTGTCTGTTCTCCGATC 
TCCTTTCGCCGTCGACGAATTCTCCGGCATTTCTTCATCACCAGTCGCGGCCGTTGTAGT 
CAAGGAAGAGCCATCCATGACAACGGTATCTGAAACTTTCTCTGATTTCTCGGCGCCCTT 
GTTCTCAGATGATGACGTGTTCGATTTCCGGAGCTCAGTGGTTCCCGACTATCTCGGCGG 
CGATTTATTTGGGGAAGATCTATTCACGGCGGATATGTGTACGGATATGAACTTCGGATT 
CGATTTCGGATCCGGATTATCCAGCTGGCACATGGAGGACCATTTTCAAGATATCGGGGA 
TCTATTCGGGTCGGATCCTCTTTTAGCTGTTTAATAATATTTTAAATAAATAAATAGTTA 
TACCGGCCGTTACTAAACGGAACCGGAGAAAGTTTTGTATACCGGTGACATAAAATCTCG 
GTTATGTTCGTAATCTTTTTTTCTTTGTTATATATAAAAATATGAATGAAACTGAATTAA 
TGTAAGTTAATGGTGATAATTATTAACGTTTTAAGTTTTGAT^AAAAAAAAAAAAAAAAAA 
AAAAAAA 

>G864 Amino Acid Sequence (domain in AA coordinates: 119-186) 
MEAEKKMVLPRIKFTEHKTNTTTIVSELTNra 

FVSKRRRVKKFVNEVYLDSGAVVTGSCGQMESKKRQKRAVKSESW 

KFRGVRQRPWGKWAAE IRDPLKRVRLWLGTYlTrAEEAAMVYDNAAIQLRGPDALTNFSVT 
PTTATEKKAPPPSPVKKKKKKNNKSKKSVTASSSISRSSSlSro 

SGI S SSPVAAVVVKEEPSMTTVSETFSDFSAPIiFSDDDVFDFRS S WPDYLGGDLFGEDL 

FTADMCTDMNFGFDFGSGI1SSWHMEDHFQDIGDLFGSDPLI1AV* 

>G884 (31..1575)

TTTTTTTTTGTTTGTTAATTTTGGGGATCGATGTCGGAAAAGGAAGAAGCTCCGTCGACA 
TCGAAGTCCACCGGAGCTCCGTCGCGTCCGACTTTATCTCTTCCTCCACGGCCGTTTAGT 
GAGATGTTCTTTAACGGTGGCGTTGGATTCAGTCCTGGTCCGATGACTCTGGTCTCTAAT

ATGTTCCCTGATTCCGATGAGTTTAGGTCTTTCTCTCAGCTTCTCGCTGGAGCCATGTCT 

TCTCCAGCGACTGCAGCTGCTGCTGCTGCTGCTGCGACGGCTAGTGATTACCAGAGACTT 

GGTGAAGGGACTAATAGCTCTAGTGGTGATGTTGACCCGAGATTCAAGCAAAACAGACCA 

ACCGGTTTGATGATTTCTCAATCTCAATCGCCGTCGATGTTCACCGTACCGCCTGGTTTA 

AGTCCAGCTATGTTGCTCGATTCACCAAGCTTTTTGGGTCTTTTCTCTCCCGTTCAGG^ 

TCATATGGAATGACACATCAGCAAGCTCTAGCTCAAG 

AATGCCAATATGCAACCACAAACAGAGTACCCTCCTCCCTCTCAAGTTCAATCATTTTC^ 
TCGGGTCAAGCGGAGATCCCGACCTCGGCTCC^CT 

GTAACCATCATAGAGCACAGGTCACAACAGCCTCTAAATGTTGACAAACCAGCTGATGAT
GGCTATAACTGGCGAAAATATGGGCAAAAGCAAGTTAAAGGTAGCGAGTTTCCACGAAGC
TATTACAAGTGTACTAATCCAGGATGTCCTGTCAAGAAGAAGGTTGAGAGATCTCTTGAT 
GGACAAGTAACGGAGATTATCTACAAAGGTCAGCACAATCATGAACCTCCTCAAAACACT 
AAGCGAGGTAACAAAGATAACACCGCGAATATAAATGGGAGTTCGATAAATAACAATCGC 
GGGAGTTCTGAATTGGGGGCATCACAGTTTCAAACTAATAGCTCCAACAAGACTAAGAGA 
GAGCAACATGAAGCAGTAAGTCAAGCTACGACAACAGAGCACTTGTCTGAGGCAAGTGAC
GGTGAAGAAGTTGGTAATGGAGAAACTGATGTGAGAGAGAAAGATGAGAATGAGCCTGAT 
CCCAAGAGAAGAAGTACAGAAGTTCGGATTTCAGAACCAGCT 

ACTGTGACAGAGCCTAGAATTATTGTCCAAACGACGAGTGAAGTTGATCTTCTAGATGAT 

GGATATAGGTGGCGTAAATATGGACAGAAAGTTGTCAAAGGGAATCCTTATCCGAGGAGC 

TACTACAAGTGCAGAACACCAGGATGTGGTGTGAGGAAAGATGTAGAGAGAGCAGCAACA 

GATCCAAAAGCTGTAGTAACAACATATGAAGGAAAACATAACCATGACCTTCCCGCTGCT 

AAATCAAGCAGCCATGCCGCTGCAGCGGCACAGTTAAGGCCAGATAATCGACCTGGCGGT

TTGGCTAACTTAAATCAACAGCAGCAGCAACAGCCCGTTGCGCGGCTAAGGCTTAAAGAA 

GAGCAAAGAACTTGAGAGAAGAAAACTCTTGACCGTTTTTC^ 

TCCACTCACACACTTGTCTGAAAAATCTAGCAGTTTGCAGGAAAGAAA 

GTTGTAGTTCTTCTATGTTCTGGTGTAAAACTTAAAAGCTTTTTAGGGTTTTCAGATTTC 

TGTTTACTAATACTGTATGTGAATTCTTTTGTA^^ 

TTTTGTGTTGTATCTTTTGTGTTATTGTTTCAGTAAAAGATAGGTCTTACATTTTGTGTA 
AAAAAAAAAAAAAAAAAAA 

>G884 Amino Acid Sequence (conserved domain in AA coordinates: 227-285, 407-465)

MSEKEEAPSTSKSTGAPSRPTLSLPPRPFSEMFFNGGVGFSPGPMTLVSNMFPDSDEFRS

FSQLLAGAMSSPATAAAAAAAATASDYQRLGEGTNSSSGDVDPRFKQNRPTGLMISQSQS

PSMFTVPPGLSPAMLLDSPSFLGLFSPVQGSYGMTHQQALAQVTAQAVQANANMQPQTEY

PPPSQVQSFSSGQAQIPTSAPLPAQRETSDVTIIEHRSQQPLNVDKPADDGYNWRKYGQK

QVKGSEFPRSYYKCTNPGCPVKKKVERSLDGQVTEIIYKGQHNHEPPQNTKRGNKDNTAN

INGSSINNNRGSSELGASQFQTNSSNKTKREQHEAVSQATTTEHLSEASDGEEVGNGETD

VIIEKDENEPDPKRRSTEWISEPAPAASHRTVTEPRIIV^^ 

VVKGNP YPRS YYKCTTPGCGVRKHV^RAATDPKAVvTTTYEGKHNHDL S SHAAAAA 

QLRPDNRPGGLANLNQQQQQQPVARLRLKEEQTT*

>G898 (161..772)

GAAAAAAAGATTCAAAAACCCTAGATTTCACAAAATCGATTGGCTGTGAAATTTCTGTCC 
GGCGATTTTCCTCGAGTGAAATTCGGCTCAAGGTGATTATAGCGATCATCGAATCAAATT 
GATTGAAGAGGTACAAAGGTTAGTTACTTTGAGCTGAAAGATGAACACGTCAGAGGTGAG 
AGTACCTCGAGGAAATCGACGGAGGAAAGCTGTGATTGATCTGAATGCGGTACCTGTTGA 
TCAAGAAGGGACCTCTGCTTCTGTTAGAACTCTTACGGTGCCTATTACACCGTCTCAGCC 
TGCTCCTACGATGATTGATGTCGATGCTATTGAGGATGATGTTATTGAATCATCCGCTAG 
TGCTTTTGCTGAAGeTAAAAGCAAATCAAGAAATGCACGTCGGAGACCTTTGATGGTTGA 
TGTAGAGTCAGGAGGTACGACTAGATTCCCTGCCAACATAAGCAACAAACGCAGAAGGAT 
TCCTTCTAGTGAATCTGTCATCGACTGTGAGCATGCCTCTGTAAATGATGAAGTCAACAT 
GTCTTCGAGAGTGTCTAGATCAAAGGCTCCAGCTCCTCCACCAGAAGAGCCAAAGTTTAC 
ATGTCCAATCTGCATGTGTCCCTTTACGGAGGAGATGTCAACCAAGTGCGGTCACATCTT 
CTGCAAGGGATGTATAAAGATGGCAATATCTCGCCAGGGCAAATGCCCTACTTGTAGGAA 
AAAGGTTACTGCAAAAGAGCTGATTCGAGTTTTCCTTCCAACCACTAGATGAGTGGTCCG 
GQ\ACATCACCAGCCACCCTGTCTAATGGTTTATCAGACTATCCTCCTATTCACTTTGGA 
ACATTGAAGGGACTTCGTTGACTTGGTATTTTTGAATATTTTGCTTTGTTGGAAGAGAAA 
TATTCAGTGATCAAGAAGCCAGAAGGCCCTATCATTCGATGGATATCATTGGTAATAACT 
CTTTGTTTTTAGTTGTTGTTCTATGTAATTTAGGTCTCTGCAAACCTCT

CTTCTCTCTTGATAGATGATAAGATATATGGAAAAAAAAATTAATATTGAATCTTTACTA 

AAA 

>G898 Amino Acid Sequence (domain in AA coordinates: 148-185) 
MNTSEVRVPRGNRRRKAVIDLNAVPVDQEGTSASW 

VIESSASAFAEAKSKSRNARRRPLMVDVESGGTTRFPANISNKRRRIPSSESVIDCEH^ 
VNDEV3MSSRVSRSKAPAPPPEEPKFTCPICMCPFTEEMSTKCGHIFCKGCIKMAISRQG 
KCPTCRKKVTAKELIRVFLPTTR* 
>G900 (1..648) 

ATGGGGAAGAAGAAGTGCGAGTTATGTTGTGGTGTAGCGAGAATGTATTGTGAGTCAGAT 
CAAGCGAGTTTATGTTGGGATTGTGACGGTAAAGTTCACGGAGCTAATTTTCTGGTGGCG 
AAACACATGCGTTGTCTTCTATGTAGCGCGTGTCAGTCACACACGCCTTGGAAAGCTTCT 
GGGCTGAATCTTGGCCCAACTGTTTCTATCTGTGAGTCTTGTTTAGCTCGTAAGAAGAAT 
AACAACAGCTCCCTCGCCGGGAGGGATCAGAATCTTAACCAAGAAGAAGAGATCATTGGT 
TGTAACGACGGAGCTGAGTCTTATGATGAGGAAAGCGATGAGGATGAAGAAGAAGAAGAA 
GTGGAGAATCAGGTTGTTCCGGCTGCGGTGGAGCAAGAACTTCCGGTGGTGAGTTCGTCG 
TCTTCGGTTAGTAGTGGTGAAGGAGATCAGGTGGTGAAAAGGACGAGACTTGATTTGGA^ 
CTTAACCTCTCCGATGAGGAGAACCAATCTAGACCATTGAAAAGATTATCGAGAGACGAA 
GGTTTGTCAAGATCAACTGTTGTGATGAATAGCTCAATCGTGAAATTACACGGAGGGAGG 
AGAAAAGCAGAGGGATGTGATACATCATCGTCGTCTTCGTTTTATTGA 

>G900 Amino Acid Sequence (domain in AA coordinates: 6-28, 48-74) 
MGKKKCELCCGVARMYCESDQASLCWDCDGKVHGANFLVAKHMRCLLCSACQSHTPWKAS

GLNLGPTVSICESCLARKKNNNSSLAGRDQNLNQEEEIIGCNDGAESYDEESDEDEEEEE
VENQVVPAAVEQELPVVSSSSSVSSGEGDQVVKKTRLDLDLNLSDEENQSRPLKRLSRDE
GLSRSTVVMNSSIVKLHGGRRKAEGCDTSSSSSFY* 
>G913 (108..806)

CATTCAAAAACATCATATATATAGACAAACAC^ 

ACAAACAAAAACACATTGTAACATTAGTTTAAGCATTAAGCTTCTTTATGTCGAATAATA
ATAATTGTCCGACCACCGTGAATCAAGAAACGAC(^CGTCTCGTGAAGTCTCAATCACAT 
TGCCTACTGATCAATCTCCTCAAACCTCACCAGGATC^TCTTCTTCTCCTTCACCGAGAC 
CTTCCGGTGGATCACCGGCGAGT^AGAACGGCGACTGGATTATCCGGCAAGCACTCTATTT 
TCAGGGGGATTCGACTACGTAACGGAAAATGGGTATCGGAGATTAGAGAGCCACGTAAAA 
CGACAAGAATTTGGCTCGGGACTTATCCGGTACCGGAGATGGCTGCCGCCGCTTACGACG 
TGGCTGCGTTAGCTTTAAAAGGACCCGACGCCGTTTTGAATTTTCCTGGTTTAGCTTTGA 
CTTACGTGGCTCCGGTTTCAAACTCTGCTGCGGATATAAGAGCGGCTGCTAGTAGAGCAG 
CGGAGATGAAGCAACCGGATCAGGGTGGGGATGAGAAGGTATTGGAACCGGTTCAACCCG 
GCAAAGAGGAAGAATTAGAAGAAGTGTCGTGTAACTCGTGTTCGTTGGAGTTTATGGATG 
AGGAAGCGATGTTGAATATGCCGACTTTGTTGACGGAGATGGCTGAAGGGATGTTGATGA 
GTCCACCGAGAATGATGATACATCCGACGATGGAAGATGATTCGCCGGAGAATCATGAAG 
GAGATAATCTTTGGAGTTATAAATGAATCCATTGAAGCTGCTCTCTTTTTTATTGTTTTC 

>G913 Amino Acid Sequence (domain in AA coordinates: 62-128) 
MSNNNNSPTTVNQETTTSREVSITLPTDQSPQTSPGSSSSPSPRPSGGSPARRTATGLSG
KHSIFRGIRLRNGKWVSEIREPRKTTRIWLGTYPVPEMAAAAYDVAALALKGPDAVLNFP
GLALTYVAPVSNSAADIRAAASRAAEMKQPDQGGDEKVLEPVQPGKEEELEEVSCNSCSL
EFMDEEAMLNMPTLLTEMAEGMLMSPPRMMIHPTMEDDSPENHEGDNLWSYK*
>G937 (45..1046)

TGGAAAAAGTTTGA^TTTTTAATTCGAATCGAGAAAAAATAAAAATGGGTTCTXTAGGTG 
ATGAGCTTAGTTTGGGATCGATCTTTGGGAGAGGAGTTTCGATGAATGTTGTGGCGGTTG 
AGAAAGTTGATGAACATGTTAAGAAGCTTGAAGAAGAGAAGAGAAAGCTCGAAAGTTGTC 
AACTTGAGCTTCCTCTGTCTTTGCAGATTTTAAACGATGCGATTTTGTATCTGAAGGATA 
AGAGATGTTCAGAGATGGAGACTC^CC^TTGTTGAAAGATTTCATTTCTGTTAATAAAC 
CTATTCAAGGAGAAAGAGGAATAGAATTGCTGAAAAGAGAGGAGCTAATGAGGGAGAAGA 
AGTTT<^GCAATGGAAAGCTAATGATGATCA(^CTAGTAAGATCAAGAGCAAGCTTGAGA 
TTAAGAGAAATGAGGAGAAATCTCCTATGTTGTTGATTCCAAAGGTGGAAACTGGTTTAG 
GCCTCGGTTTAAGTTCGAGTTCGATAAGAAGAAAAGGGATTGTTGCCTC^TGTGGCTTTA 
CTTCTAACTCTATGCCACAACCACCAACACCAGCAGTACQ 
AGCAGCAAGCTTTACGGAAGCAAAGAAGGTGTTGGAATCCAGAGTTGCATCGCCGATTTG

TCGATGCATTGCAACAGCTAGGTGGACCGGGAGTGGCAACTCCTAAACAAATTAGAGAAC 

ATATGCAAGAAGAAGGCTTAACCAATGATGAAGTCAAGAGTCATTTACAGAAATACAGGT 

TACACATCAGGAAGCCAAATTCGAATGCGGAGAAACAATCAGCAGTTGTTTTAGGGTTTA 

ACTTGTGGAATTCTTCAGCACAAGATGAAGAAGAGACATGTGAAGGAGGAGAATCATTGA 

AGAGAAGCAATGCGCAATC^GATTCTCCTCAAGGTCCTTTGC^GTTACCGTCTACAACAA 

CAACAACTGGTGGAGATAGTAGCATGGAAGATGTTGAAGATGCTAAGTCTGAGAGCTTTC 

AACTGGAGAGATTGAGATCACCATAAATCTCAAGAAACCAAACTCTTGATGACGG 

TTATTTTGGATTCATTACTATATCTATTAGT^ 

TTTATAGATATATATATAGAGAAAAAGAGAGAGTGAGGATGGTTCAAATTATTTGCAGA 

>G937 Amino Acid Sequence (conserved domain in AA coordinates: 197-246)

MGSLGDELSLGSIFGRGVSMNVVAVEKVDBHVT^ 

LYLKDKRCSEMETQPLIjKDFI SVNKPI QGERGIELIjI^ 

KSKLEIKIO^EKSPMIjIjIPKvBTGLGIjGLSSSSIRRKGIVASCGFTSNSMPQPPTPAVPQ 
QPAFLKQQAIjRKQRRCVTOPELHRRFVDAIjQQLGGPGVATPK^ 

LQKYRLHIRKPNSNAEKQSAV^GFNLWNSSAQDEEETCEGGESLKRSNAQSDSPQGPLQ 

LPSTTTTTGGDSSMEDVEDAKSESFQLERLRSP* 

>G960 (63..1538)

TACCGTCGACCCACGCGTCCGAGTGTATTCAAAGTCGGA7^AGAAACCCTAAAGAAGAGGA 

TTATGGGTGCTGTATCGATGGAGTCGCTTCCTTTAGGTTTCAGATTCAGACCTACCGATG 

AAGAGCTCGTCAATCACTACCTCCGTCTCAAGATCAACGGACGTCACTCCGATGTCCGTG 

TCATCCCTGATATCGATGTCTGCAAATGGGAACCTTGGGATCTTCCTGCTCTCTCGGTGA 

TTAAGACGGATGATCCAGAGTGGTTCTTTTTCTGCCCTCGTGATCGGAAATACCCTAATG 

GTCATCGCTCTAACAGAGCAACTGACTCTGGCTATTGGAAAGCTACTGGTAAAGATCGTA 

GCATCAAGTCTAAGAAGACTTTAATCGGTATGAAGAAGACTCTTGTCTTCTATCGTGGAC 

GAGCTCCTAAAGGTGAGCGGACTAATTGGATTATGCACGAGTATCGTCCCACTCTTAAGG 

ATCTTGATGGGACTTCCCCTGGCCAAAGCCCTTACGTTCTTTGTCGCCTCTTCCACAAGC 

CTGATGATCGGGTTAATGGTGTCAAGTCCGATGAAGGAGCTTTTACGGCCAGCAACAAAT 

ACTC^CCTGATGATACATCATCTGATCTTGTTCAAGAAACACCTTCCTCTGATGCTGCTG 

TTGAGAAACCATCAGATTATTCAGGTGGATGCGGTTATGCTCATAGTAATAGTACCGCAG 

ATGGGACAATGATTGAGGCACCTGAAGAGAATCTTTGGTTATCTTGTGACCTTGAAGATC 

AAAAGGCACCACTACCGTGTATGGATTCTATATATGCTGGTGATTTCAGTTACGATGAGA 

TTGGATTCCAATTTCAAGATGGTACCAGCGAACCAGATGTATCACTAACAGAATTGTTGG 

AGGAGGTGTTCAATAACCCTGATGACTTCTCTTGCGAGGAATCGATCAGTCGAGAGAATC 

CAGCAGTCTCACCAAATGGGATATTTTCATCTGCTAAAATGCTGCAGTCTGCAGCACCAG 

AGGATGCTTTCTTCAACGACTTCATGGCTTTCACTGATACAGATGCTGAGATGGCGCAAT 

TGCAGTATGGTTCAGAAGGTGGAGCTTCTGGTTGGCCAAGTGACACTAATTCATACTATA 

GTGATTTGGTTCAGCAAGAGCAAATGATCAATCATAACACAGAGAACAACCTCACAGAAG 

GGAGAGGGATAAAGATCCGGGCTCGACAGCCTCAGAACCGGCAGAGTACAGGATTGATAA 

ACCAGGGTATTGCTCCAAGGAGAATCCGTCTGCAGCTGCAGTCTAACTCTGAAGTAAAAG 

AACGAGAGGAGGTGAATGAAGGACACACTGTTATTCCCGAGGCCAAAGAAGCTGCAGCTA 

AATACTCAGAGAAGAGTGGTTCTTTGGTTAAACCTCAAATAAAGCTCAGGGCGCGGGGAA 

CTATAGGCCAAGTAAAAGGAGAGAGATTTGCAGACGACGAGGTACAGGTGCAGAGCACAA 

AGAGAGAGAGAGAGAGAATCAAATGTAGTTTAATGTAATTAGGGATGATGCAATGTTAGC 

ATGTTTGTGTGTTGTAACTTAAAAACTTATTTAGGAATCTGATAAAAGTTACTGTTGAAA 

AAAGAAAAAAAAAAAAAAAAAAAAAAAAAAAA 

>G960 Amino Acid Sequence (domain in AA coordinates: 13-156) 

MGAVSMESLPLGFRFRPTDEELVNHYIjRLKINGRHSDVRVIPDIDVCKWEPWDLPALSVI 

KTDDPEWFFFCPRDRKYPNGHRSNRATDS GYWKATGKDRS I KS KKTL I GMKKTLVFYRGR 

APKGERTNWI MHE YRPTLKDLDGTS PGQS P YVLCRLFHKPDDRVNGVKSDEAAFTASNKY 

SPDDTSSDLVQETPSSDAAVEKPSDYSGGCGYAHSNSTADGTMIEAPEENLWLSCDIiEDQ 

KAPLPC^SIYAGDFSYDEIGFQFQDGTSEPDVSLTELLEEVTTSTNPDDFSCEESISR 

AVS PNGI FS S AKMIiQS AAPEDAF FNDFMAFTDTDAEMAQLQYGSEGGASGWP SDTNS YYS 

DLVQQEQMINHNTENNLTEGRGIKIRARQPQNRQSTGLINQGIAPRRIRIjQLQSNSEVKE 

REEVNEGHTVIPEAKEAAAKYSEKSGSLVKPQIKLRARGTIGQVKGERFADDEVQVQSTK 

RERERI KCSLM* 

>G991 (6..533)
GAAAAATGGAAGAAGAAAAGAGATTGGAGCTAAGGCTAGCTCCTCCTTGTCACCAATTCA

CTTCCAACAACAACATC^TGGATCTAAACAAAAAAGCTCGACCAAAGAAACATCATTCC 

TTTCCAATAACAGGGTTGAGGTAGCTCCAGTGGTGGGATGGCCGCCGGTGAGATCATCCC 

GGAGAAACCTAACGGCACAACTAAAGGAGGAGATGAAGAAGAAGGAGAGTGATGAAGAGA 

AGGAATTGTACGTTAAGATCAACATGGAAGGAGTTCCAATAGGAAGAAAAGTCAACCTTT 

CAGCTTATAACAACTACCAACAGCTTTGACATGCCGTTGACCAACTCTTOT 

ATTCGTGGGATCTAAACAGACAATACACTTTGGTCTACGAAGACACTGAAGGAGATAAAG 

TTCTGGTCGGGGATGTTCCTTGGGAGATGTTTGTATCTACTGTAAAGAGGTTGCATGTTT 

TAAAGACCTCCCACGCCTTCTCACTCTCACCTAGAAAACATGGCAAGGAATAGAGAGAGG 

TTGGCCAAAATCATCAGTTCGATGGTTTGTTTTTAATGTAATTTTTGTG 

GGTTTGGCTTTGATTTACTGGTTTTCTTTTTCACTTATGTACTAGGTTTTTGCTTGCTAT 

GTTATTTCTTGTTTTGGTTGTAAATATGCTGTTCGTTTAAGAAATCGGGGGTTAGTATGT 

TATCGTGTGTATAAAAATAGTGTAAGGACGTAAGTTGATTACAAAAAAAAAAAAAAAAAA 

AAAAAAAAA 

>G991 Amino Acid Sequence (domain in AA coordinates: 7-14, 48-59, 82-115, 128-164)

MEEEKRLELRLAPPCHQFTSNNNIN^ 

NIjTAQIiKEEMKKKESDEEKELYVKINM 

WDLNRQYTLVYEDTEGDKVLVGDVPWEMFVSTVKRLHVLKTSHAFSIiSPRKHGKE* 
>G748 (98..1444)

CCACGCGTCCGCACTCTCCCAAATCTCTCTTCTTTAACAACAAAAAAAAAATCACAGAGA 
CATAGAGAGAAGAAGACGGAACAGAGGCTCCAAAAAAATGATGATGGAGACTAGAGATCC 
AGCTATTAAGCTTTTCGGTATGAAAATCCCTTTTCCGTCGGTTTTTGAATCGGCAGTTAC 
GGTGGAGGATGACGAAGAAGATGACTGGAGCGGCGGAGATGACAAATCACCAGAGAAGGT 
AACTCCAGAGCTATCAGATAAGAACAACAACAACTGTA^^ 

GAAACCCGAAACCTTGGACAAAGAGGAAGCGACATCAACTGATCAGATAGAGAGTAGTGA 
CACGCCTGAGGATAATCAGC^GACGACACCTGATGGTAAAACCCTAAAGAAACCGACTAA 
GATTCTACCGTGTCCGAGATGCAAAAGCATGGAGACCAAGTTCTGTTATTACAACAACTA 
CAACATAAACCAGCCTCGTCATTTCTGCAAGGCTTGTCAGAGATATTGGACTGCTGGAGG 
GACTATGAGGAATGTTCCTGTGGGGGCAGGACGTCGTAAGAACAAAAGCTCATCTTCTCA 
TTACCGTCACATCACTATTTCCGAGGCTCTTGAGGCTGCGAGGCTTGACCCGGGCTTACA 
GGCAAACACAAGGGTCTTGAGTTTTGGTCTCGAAGCTCAGCAGGAGGACGTTGCTGCTCC 
CATGACACCTGTTATGAAGCTACAAGAAGATCAAAAGGTCTCAAACGGTGCTAGGAACAG 
GTTTCACGGGTTAGCGGATCAACGGCTTGTAGCTCGGGTAGAGAATGGAGATGATTGCTC 
AAGCGGATCCTCTGTGACCACCTCTAACAATCACTCAGTGGATGAATCAAGAGCACAAAG 
CGGCAGTGTTGTTGAAGCACAAATGAACAACAACAACAACAATAACATGAATGGTTATGC 
TTGCATCCCAGGTGTTCCATGGCCTTACACGTGGAATCCAGCGATGCCTCCACCAGGTTT 
TTACCCGCCTCCAGGGTATCCAATGCCGTTTTACCCTTACTGGACCATCCCAATGCTACC 
ACCGCATCAATCCTCATCGCCTATAAGCCAAAAGTGTTCAAATACAAACTCTCCGACTCT 
CGGAAAGCATCCGAGAGATGAAGGATCATCGAAAAAGGACAATGAGACAGAGCGAAAACA 
GAAGGCCGGGTGCGTTCTGGTCCCGAAAACGTTGAGAATAGATGATCCTAACGAAGCAGC 
AAAGAGCTCGATATGGACAACATTGGGAATCAAGAACGAGGCGATGTGCAAAGCCGGTGG 
TATGTTCAAAGGGTTTGATCATAAGACAAAGATGTATAACAACGACAAAGCTGAGAACTC 
CCCTGTTCTTTCTGCTAACCCTGCTGCTCTATCAAGATCACACAATTTCCATGAACAGAT 
TTAGAGTTACATATGTATATGTATATATGTATGATTGATTGTATGTATAGATGATACTGG 
AGAATGATGAGTTTTTGAGAATCAAACTCTTTTCTTCTTTCTAGTGATTGCCTTTATTCC 
TTTAGATGTTTTGGTTCTCTGTACACTATTTGATTTACCTTTTTTACTTTCTTTCTTCAT 
TTGTCAGGAAATGTTGGAAGATAACATTAATGGTAAAAAGTTGGTGTGGACCGTTGTTGC 
GTTGGCATTTCAAAAAAAAAAAAAAAA 

>G748 Amino Acid Sequence (domain in AA coordinates: 112-140) 
MMMETRDPAIKLFGMKIPFPSVFESAVTVEDDEEDDWSGG 

NDNS FNNS KPETLDKEEATSTDQ I E S SDTPEDNQQTTPDGKTIjKKPTKI L PCPRCKSMET 
KFCYYNNYNINQPRHFCKACQRYWTAGGTMRNVPVGAGRRKNKS S S SHYRHIT I SEALEA 
ARLDPGLQAOTRVIiSFGLEAQQQHVAAPMTPVMKLQEDQKVSNGARl^ 
VENGDDCSSGSSVTTSNlfflSVDBSRAQSGSVVEAQlV^^ 

PAMPPPGFYPPPGYPMPFYPYWTIPMLPPHQSSSPISQKCSNTNSPTLGKHPRDEGSSKK 

DNETERKQKAGCVLVPKTLRIDDPNEAAKSSIWT^^ 

NNDKAENS PVLS ANPAAIiSRSHNFHEQ I * 
>G247 (1..660)

ATGAGAATGACAAGAGATGGAAAAGAACATGAATACAAGAAAGGTTTATGGACAGTGGAA 
GAAGACAAGATCCTCATGGATTATGTCCGAACTCATGGCCAGGGCCACTGGAACCGCATC 
GCCAAGAAAACTGGGCTCAAGAGATGTGGGAAAAGCTGTAGGTTGAGATGGATGAACTAC 
TTAAGCCCTAATGTTAACAGAGGCAATTTTACTGACCAAGAAGAAGATCTCATCATCAGA 
CTCCACAAGCTCCTCGGCAACAGATGGTCGTTGATAGCGAAAAGAGTTCCGGGAAGAACA 
GACAACCAAGTAAAGAATTACTGGAACACACATCTCAGCAAGAAACTTGGTCTCGGAGAT 
GATTCAACTGCCGTCAAAGCCGCATGCGGTGTAGAGTCTCCACCGTCTATGGCCCTTATA 
ACCACAACGTCCTCCTCTCATCAAGAGATCTCCGGTGGAAAAAATTCAACTCTAAGGTTC 
GACACTTTAGTTGACGAATCCAAACTCAAACGZUU^TCCA 

ACTGACGTAGAAGTTGCAGCTACGGTTCCAAATCTGTTCGATACCTTTTGGGTTCTTGAA 
GACGACTTCGAGCTTAGTTCACTCACTATGATGGATTTTACTAATGGGTATTGCCTTTGA 
>G247 Amino Acid Sequence (domain in AA coordinates: 15-116) 
MRMTRDGKEHEYKKGLWTVEEDKILMDYVRTHGQGHWNRIAKKTGLKRCGKSCRLRWMNY

LSPNVNRGNFTDQEEDLIIRLHKLLGNRWSLIAKRVPGRTDNQVKNYWNTHLSKKLGLGD

HSTAVKAACGVESPPSMALITTTSSSHQEISGGKNSTLRFDTLVDESKLKPKSKLVHATP

TDVEVAATVPNLFDTFWVLEDDFELSSLTMMDFTNGYCL*

>G585 (111..2039)

CTCTCAAACATTTCTCTGTTTGTTCCGGCGAA^^ 

cac^aaaaccttaacatctagtttgtatcctctctgatacttc^aaaa 

aaacaatggctaccggacaaaacagaacaactgtgccagagaatc 

c^gtttc^gttcgaaacattcaatggagttatggtatcttttggtctgtctctgcttctc 

agtctggagttttagaatggggagatggatactataatggagatatcaaaacgaggaaga 

cgattcaagcttcggagatcaaagctgatcagcttggtctacggaggagcgagcagctta

gcgagctttacgagtctctctccgtcgctgaatcttcttcttcaggcgttgctgccggat 

ctcaagtcaccagacgagcttccgccgccgcactttcaccggaagatctcgccgacaccg 

agtggtactatttggtttgtatgtctttcgtcttcaacattggtgaaggaatgcctggac 

ggacgtttgcaaacggtgaaccgatatggttgtgcaacgctcatacggcggatagtaaag 

tgtttagccgttctcttctagcaaaaagtgctgcggttaagacagtggtttgcttcccgt 

tccttggaggagtcgttgagattggtaccacagaacatattacggaagacatgaatgtaa 

tacaatgcgtgaagacatcattcctcgaagcccctgatccgtacgctacaatattaccag 

caagatccgattatcacatcgagaacgttcttgatccgcaacagattctaggcgacgaga 

TTTACGCGCCTATGTTCAGTACGGAGCCTTTTCCAACAGCTTCTCCGAGCAGAACTACCA 
ACGGTTTCGATCAAGAACATGAACAAGTAGCAGATGATCATGATTCTTTCATGACCGAAA 
GAATCACTGGAGGAGCTTCTCAGGTGCAAAGCTGGCAGCTCATGGACGACGAGCTTAGTA 
ACTGCGTTGACCAGTCGCTAAATTCCAGCGATTGCGTCTCTCAAACGTTTGTTGAAGGGG 
CGGCTGGACGGGTTGCTTACGGTGCAAGAAAGAGTAGAGTTCAAAGACTAGGGCAAATTC 
AAGAGCAACAGAGAAATGTGAAGACATTGTCATTTGATCCAAGAAACGACGACGTTCATT 
ACCAAAGTGTGATCTCAACGATTTTTAAGACCAACCATCAGTTAATTCTCGGACCGCAGT 
TTCGAAACTGCGATAAACAGTCAAGCTTCACTAGGTGGAAGAAATCATCGTCATCATCAT 
CAGGAACCGCCACGGTCACGGCACCATCACAAGGAATGTTAAAGAAAATTATTTTCGATG 
TTCCGCGAGTGCACCAGAAAGAGAAGTTAATGTTGGACTCACCAGAAGCCAGAGATGAAA 
CTGGGAACCATGCGGTTTTAGAGAAGAAGCGCCGCGAGAAATTGAACGAACGGTTCATGA 
CCTTGAGAAAAATCATTCCGTCAATCAACAAGATCGATAAAGTATCGATTCTTGACGATA 
CGATAGAGTATCTTCAAGAACTCGAGAGACGGGTTCAAGAACTAGAATCTTGCAGAGAAT 
CAACCGATACAGAGACTCGTGGGACGATGACGATGAAGAGGAAGAAACCATGCGACGCAG 
GAGAAAGAACATCAGCTAATTGCGCAAATAATGAAACAGGAT^ATGGGAAGAAGGTGTCGG 
TTAACAATGTTGGTCAAGCCGAGCCAGCAGATACCGGTTTTACTGGTTTAACCGATAATT 
TAAGGATCGGTTCGTTTGGTAATGAGGTGGTTATTGAGCTTAGATGTGCTTGGAGAGAAG 
GAGTATTGCTTGAGATAATGGATGTGATTAGTGATCTCCATTTGGATTCTCATTCGGTTC 
AATCCTCGACCGGAGACGGTTTGCTCTGCTTAACCGTCAATTGCAAGCACAAGGGGTCAA 
AAATAGCGAGACCAGGAATGATCAAAGAAGCACTTCAAAGGGTTGCATGGATCTGTTGAA 
GACTACTTAGTTAAAATTGACAGCAAAGAAAAAACATTCCCGGTTTGGTTTCTATTCTTT 
GGTTTTCTTCTAACCGGGTTTTAGGAATTAATGTTATGTTTATCATTTGTTTTTTGTTTT 
TTTTTTGTGTCTTTTTTTCCGTTGCTTAACGTAGGTGAAGAGGAACATACACTATGCGTA 
TTTTGTTTGAGGTAGATTATTTTAAGGGTATTAGTAATAGTAATAGCCAGTTTAGATGAT 
TTTGTGTTCTTTTGTTGTT 
>G585 Amino Acid Sequence (domain in AA coordinates: 436-501)

^EETMATGQNRTTVPENIjKKHLAVS VRNI QWS YG I FWS VS AS QSG VLE WGDGYYNGD I K 

TRKTIQASEIKADQLGLRRSEQLSELYESLSVAESSSSGVAAGSQVTRRASAAALSPEDL 

ADTEWYYIiVCMSFWNIGEGMPGRTFANGEPIWLCNAHTADSKVFSRSLLAK^ 

CFPFLGGVVEIGTTEHITEDMNVIQC^KTSFLEAPDPYATILPARSDYHIDNVLDPQQI^ 

GDEIYAPMFSTEPFPTASPSRTTNGFDQEHEQVADDHDSFMTERITGGASQVQSWQLMDD 

ELSNCVHQSIjNS SDCVS QTFVEGAAGRVAYGARKSRVQRIjGQI QEQQRNVKTLS FDPRND 

D VHYQS VI ST I FKTNHQIjI LGPQFRNCDKQS S FTRWKKS S SS S SGTATVTAPS QGMLKKI 

I FDVPRVHQKEKLMLDSPEARDETGNHAVLEKXRR^ IPS INKIDKVS I 

LDDTIEYIiQELERRVQELESCRESTDTETRGTMTMKRKKPCDAGERTS^ 

KVSVNNVGEAEPADTGFTGLTDNLRIGSFGN^ 

HSVQSSTGDGIiLCIiTVNCKHKGSKIATPGMIKEAIiQRVAWIC* 

>G634 (1..798) 

ATGGAGCAAGGAGGAGGTGGTGGTGGTAATGAAGTTGTGGAGGAAGCTTCACCTATTAGT 
TCAAGACCTCCTGCTAACAACTTAGAAGAGCTTATGAGATTCTCAGCCGCCGCGGATGAC 
GGTGGATTAGGAGGTGGAGGTGGAGGAGGAGGAGGAGGAAGTGCTTCTTCTTCATCGGGA 
AATCGATGGCCGAGAGAAGAAACTTTAGCTCTTCTTCGGATCCGATCCGATATGGATTCT 
ACTTTTCGTGATGCTACTCTCAAAGCTCCTCTTTGGGAACATGTTTCCAGGAAGCTATTG
GAGTTAGGTTACAAACGAAGTTCAAAGAAATGCAAAGAGAAATTCGAAAACGTTCAGAAA 
TATTACAAACGTACTAAAGAAACTCGCGGTGGTCGTCATGATGGTAAAGCTTACAAGTTC 
TTCTCTCAGCTTGAAGCTCTCAACACTACTCCTCCTCCTCCTCCTTCTCATCCTCACGCT 
CATCAACCAGAACAGAAACAACAACAACAACCACAACAAGAGATGG^ 

CAATCATCATTACCATCATCATCAAGATGGCCAAAGGCAGAGATTCTAGCGCTTATAAAC
CTGAGAAGTGGAATGGAACCAAGGTACCAAGATAATGTACCTAAAGGACTTCTATGGGAA 
GAGATCTCAACTTCAATGAAGAGAATGGGATACAACAGAAACGCTAAGAGATGTAAAGAG 
AAATGGGAAAACATAAACAAATACTACAAGAAAGTTAAAGAAAGCAACAACAGCAACTAC 
AACAACAAGAATCAATGA 

>G634 Amino Acid Sequence (domain in AA coordinates: 62-147, 189-245)

MEQGGGGGGNEVVEEASPISSRPPANNLEELMRFSAAADDGGLGGGGGGGGGGSASSSSG

NRWPREETLALLRIRSDMDSTFRDATLKAPLWEHVSRKLLELGYKRSSKKCKEKFENVQK

YYKRTKETRGGRHDGKAYKFFSQLEALNTTPPPPPSHPHAHQPEQKQQQQPQQEMVMSSE

QSSLPSSSRWPKAEILALINLRSGMEPRYQDNVPKGLLWEEISTSMKRMGYNRNAKRCKE

KWENINKYYKKVKESNNSNYNNKNQ*

>G676 (1..612) 

atgagaaagaaagtaagtagtagtggtgacgaaggaaacaatgagtacaagaaaggtttg 
tggacagtagaagaagacaaaatcctcatggattatgtcaaagctcatggcaaaggtcac 
tggaatcgtattgccaaaaagactggtttaaagagatgtggaaagagttgtagattgagg 
tggatgaattatctcagccctaatgtgaaaagaggcaatttcaccgagcaagaagaggat 
cttatcattaggctccacaagttgcttggtaataggtggtctttaattgctaaaagagtg 
ccgggtcgaacggataatcaagtgaagaactattggaacacgcatcttagtaagaaactc 
ggaatcaaagatcagaaaaccaaacagagcaatggtgatattgtttatcaaatcaatctc 
ccgaatcctaccgaaacatcagaagaaacgaaaatctcgaatattgtcgataacaataat 
atcctcggagatgaaattcaagaagatcatcaaggaagtaactacttgagttcactttgg 
gttcatgaggatgagtttgagcttagcacactcaccaacatgatggactttatagatgga 

cactgtttttga 

>G676 Amino Acid Sequence (domain in AA coordinates: 17-119) 
MRKKVSSSGDEGNNEYKKGLWTVEEDKILMDYVKAHGKGHWNRIAKKTGLKRCGKSCRLR
WMNYLSPNVKRGNFTEQEEDLIIRLHKLLGNRWSLIAKRVPGRTDNQVKNYWNTHLSKKL

GIKDQKTKQSNGDIVYQINLPNPTETSEETKISNIVDNNNILGDEIQEDHQGSNYLSSLW

VHEDEFELSTLTNMMDFIDGHCF*

>G682 (1..228) 

ATGGATAACCATCGCAGGACTAAGCAACCCAAGACCAACTCCATCGTTACTTCTTCTTCT

GAAGAAGTGAGTAGTCTTGAGTGGGAAGTTGTGAACATGAGTCAAGAAGAAGAAGATTTG 

GTCTCTCGAATGCATAAGCTTGTCGGTGACAGGTGGGAACTGATAGCTGGGAGGATCCCA 

GGAAGAACCGCTGGAGAAATTGAGAGGTTTTGGGTCATGAAAAATTGA 

>G682 Amino Acid Sequence (domain in AA coordinates: 27-63)

MDNHRRTKQPKTNSIVTSSSEEVSSLEWEVVNMSQEEEDLVSRMHKLVGDRWELIAGRIP
GRTAGEIERFWVMKN*
>G635 (1..993) 

ATGGAGAT CATG CGTCC^GGGGTCTCAGAAAACAC TTTGAAAGGAAAAAT AAGAATCACA 

ACGCGGTGCATGTGGCTTGACAAAGGAAGACTTTTAGATGCACTTCACAAAGCAG 

GCTGCTCTATCAAGTTGTCCTGTGACATGTCCCTTGTCTCACATGGAAAGAACAGTCTCC 

GAAGTCCTGAGGAAGATTGTAAGGAAGTACAGTGGTAAAAGGCCTGAAGTCATCGCTATA 

GCCACTGAGAATCCAATGGCTGTCCGAGCTGATGAGGTC^GTGCGAGACTGTCTGGTGAT 

CCAAGTGTTGGTTCTGGAGTTGCAGCTTTAAGGAAAGTTGTTGAAGGAAATGACAAAAGA 

AGTCGGGCGAAGAAAGCACCTTCACAAGAAGCTTCCCC 

GAAGATGATATCATTGATAGTGCAAGACTACTGGCTGAAGAAGAAACTGCGGGATCAACA 
TACACGGAAGAAGTTGATACGCCCGTTGGGAGTTCTTCAGAAGAGTCAGACGATTTTTGG 
AAATCATTCATCAATCCATCATCGTCACCTTCACCGAGTGAAACAGAAAATATGAATAAG 
GTAGCTGATACGGAGCCTAAAGCAGAGGGTAAGGAAAACAGCAGAGACGACGATGAATTA 
GCTGATGCTTCAGATTCTGAAACCAAGTCATCACCT^AAACGTGTGAGGAAGAACAAATGG 
AAACCGGAGGAGATAAAGAAGGTAATCAGAATGCGAGGAGAGCTGCAiCAGTAGATTTCAA 
GTGGTGAAAGGTAGAATGGCATTGTGGGAAGAGATCTCTTCAAATCTATCAGCTGAAGGA 
ATCAATCGAAGCCCGGGACAATGGAAATCTCTCTGGGCATCACTTATTCAGAAATACGAG 
GAGAGCAAGGCTGATGAGAGAAGCAAGACGAGTTGGCCACATTTTGAGGATATGAACAAC 
ATTTTGTCAGAGCTAGGCACACCTGCGTCTTAA 

>G635 Amino Acid Sequence (domain in AA coordinates: 239-323) 

MEII^PGVSBNTLKGKIRITTRCMWLDKGRLLDAIjHKAAHAAIjSSCPVTCPLSHiyiERTVS 

EVLRKIVRKYSGKRPEVIAIATENPMAVRA^ 

SRAKKAPSQEASPKEVDRTLEDDIIDSARLLAEEETAASTYTEEVDTPVGSSSEESDDFW 

KSFINPSSSPSPSETENMNKVADTEPKAEGKENSRDDDELADASDSETKSSPKRVRKNKW 

KPEEIKKVIRMRGELHSRFQVVKGRMALWEEISSNLSAEGINRSPGQCKSLWASIiIQKYE 

ESKADERSKTSWPHFEDMNNILSEI,GTPAS* 

>G1068 (150..1310)

GAGAGTTGTTAGCTAGCTCACACGCTTTCGCTTAAAACTCAAA2UVCCTGCACTTTCTCGT 
CTATTTTCTCGGCATTCGTAAAACAGAAAAGTGGGTCTCCAAGAAAATTACCCTAAATTC 
ACAAAGATTCATACTTTTCTCCACCTCCAATG^ 

AGCAACAACAACAACAACAACAGCAGCAGCAGCAACAACAGCAACATCTACAACAACAGC
AACAACCACCGCCAGGGATGTTAATGAGTCACCACAATTCCTACAATCGAAACCCTAACG 
CCGCCGCCGCTGTTTTAATGGGTCACAACACCTCGACAT 

TACCTTTTGGTGGTTCTATGTCACCGCATCAGCCTCAACAACATGAGTATCATCATCCTC 

AGCCTCAGCAACAGATAGATCAGAAGACTCTTGAATCTCTTGGATTTCCTACTTCGCCTC

TTCCTTCTGCTTCTAATTCTTACGGTGGTGGAAATGAAGGAGGTGGTGGTGGTGATAGCG 

CCGGAGCTAATGCTAACTCTTCCGATCCACCTGCTAAACGGAACAGAGGACGTCCTCCTG 

GCTCCGGTAAGAAGCAGCTCGATGCTTTAGGAGGAACAGGAGGAGTTGGGTTCACGCCTC 

ATGTCATTGAGGTTAAAACAGGAGAGGACATAGCTACGAAGATATTGGCGTTTACGAACC 

AAGGGCCACGCGCAATCTGTATTCTCTCAGCTACAGGAGCTGTAACTAATGTGATGCTTC 

GTCAAGCTAACAATAGCAATCCTACTGGAACTGTTAAGTATGAGGGCCGATTTGAAATCA 

TTTCTCTGTCAGGTTCTTTCTTGAATTCTGAGAGTAATGGTACTGTGACCAAAACTGGTA 

ACTTGAGTGTGTCGCTGGCTGGACACGAAGGCCGGATTGTGGGTGGATGTGTTGATGGAA 

TGCTAGTAGCTGGATCACAAGTCCAGGTCATTGTGGGAAGCTTTGTACCAGATGGAAGGA 

AGCAGAAACAAAGTGCGGGGCGTGCTCAGAATACTCCGGAGCCAGCTTCAGCACCAGCCA 

ATATGTTGAGCTTTGGTGGTGTTGGTGGACCGGGAAGCCCTCGATCTCAAGGACAACAAC 

ACTCGAGCGAGTCATCAGAGGAAAACGAAAGTAATTCTCCGTTGCACCGTAGAAGCAACA 

ACAACAACAGCAACAATCATGGGATATTTGGAAACTCTACACCTCAACCGCTTCACGAAA 

TTCCTATGCAGATGTACCAGAATCTCTGGCCTGGCAACAGTCCTCAATAAACAGATGGTT 

CATGGGTCAAGATTTGACCGGGTTTGCTTCTCTGTTCCTTTTGACACATCTCTCCATC^ 

ATTTATCTCTATAAAGTAGATTGAGCTCTCTTACTCTCTCATCTTCTTCTCCTTTACTAT 

TTCTCTTAAATTTAGCTTTGGTTTTAGATAAATAGAGAGAGAGAGACATGTTAAGTAGGT 

TTCAAATTCAATCTTGTTTAGTTTGTTTCTTAGTAGTTTCTTTTGATTGTGATGATCATA 

AAGACTTGTTCTTTTTCTCCTATATTCAACGAATTATCCACTTTAA 

>G1068 Amino Acid Sequence (domain in AA coordinates: 143-150) 

MDSREIHHQQQQQQQQQQQQQQQQQHLQQQQQPPPGMLMSHHNSYNRNPNAAAAVLMGHN 

TSTSQAMHQRLPFGGSMSPHQPQQHQYHHPQPQQQIDQKTLESI1GFPTSPI1PSASNSYGG 
GNEGGGGGDSAGANANSSDPPAKRNRGRPPGSGKKQLDALGGTGGVGFTPHVIEVKTGED
IATKILAFTNQGPRAICILSATGAVTNVMIjRQANNSNPTGTVK^ 

ESNGTVTKTGNLSVSLAGHEGRIVGGCVDGMLVAGSQVQVIVGSFVPDGRKQKQSAGRAQ 
OTPEPASA^ANMLSFGGVGGPGSPRSQGQQHSSESSEENESNSPLHI^ 
GNSTPQPIiHQIPMQMYQNLWPGNSPQ* 
>G1225 (1..984) 

ATGACTCTAGAAGCTTTATCATCAAACGGTCTTTTAAACTTTTTGCTCTCTGAAACTCTT 
TCACCAACTCCATTC^GTCTCTCGTCGATCTCGAGCCATTGCCGGAAAATGATGTCATC 
ATATCGAAGAACACAATTTCGGAGATATCTAATCAAGAACCGCCACCACAGCGACAACCA 
CCAGCTACGAATCGAGGGAAGAAGCGGCGGAGGAGGAAGCCTAGGGTTTGCAAAAACGAG 
GAAGAAGCTGAGAATCAACGAATGACTCACATTGCCGTCGAAAGAAATCGAAGAAGACAA 
ATGAATCAACATCTCTCTGTCTTGCGATCTCTCATGCCTCAACCTTTTGCTCACAAGGGT 
GATCAAGCTTCAATAGTTGGTGGAGCCATAGATTTCATCAAAGAACTTGAACACAAATTA
CTATCTCTTGAAGCTCAAAAACATCATAATGCT^ 

AC^VGTCAAGACTCAAATGGTGAACAAGAGAATCCTCATCAACCATCTTCACTATCTCTA 
TCGCAGTTCTTTCTTCATTCATACG 

TCGGTGAAAACCCCTATGGAAGATCTTGAGGTGACTCTAATCGAAACTCATGCTAACATC 
AGAATCTTGTCGAGAAGAAGAGGTTTCCGGTGGAGC^CGTTGGCCACCACCAAACCGCCG 
CAGCTTTCGAAGCTGGTGGCTTCTCTACAATCGCTGTCCCTCTCCATTCTTCACCTTAGT 
GTCACAAGATTGGACAATTATGCTATTTACTCCAT 

CAGCTAAGTTCAGTAGATGACATTGCAGGAGCAGTTCACCACATGCTAAGTATCATTGAA
GAGGAGCCTTTTTGTTGCTCATCAATGTCAGAATT^ 

TCAAATGTCACTCATTCTCTCTGAGAAATCTCTTTTTTGTTGTTGTTATTCC 
ATTTTATCACATAGCACATCTTTAGTTTTTTTTTTT 

>G1225 Amino Acid Sequence (domain in AA coordinates: 78-147) 

MTLEAIjSSNGLIjNFLIjSETI*SPTPFKSIjVDIiEPLPENDVIISKNTISEISNQEPPPQRQP 

PATNRGKKRRRRKPRVCKNEEEAENQRMTHIAVERNRR^ 

DQASIVGGAIDFIKELEHKLLSLEAQKHHNAKIJ^QSVTSSTSQDSNGEQENPHQPSSLSL 
SQFFLHSYDPSQENRNGSTSSVKTPMEDLK\TTLIETHANIRILSRRRGFRWSTIiATTKPP 
QLSKLVASLQSLSLS ILHLS VTTLDNYAI YS I S AKVEESCQLS S VDDIAGAVHHMLS I IE 
EEPFCCSSMSETjPFDFSLNHSNVTHSL* 
>G1337 (97..1398)

AATGGATTTGTCATCATTCTTCTGACCGTCCTTAGTCTCTGAAAATAAATTCTGATTTTG 

ATTTCGAATTTTAGGGATTTTGAGAGAGAGTCAGTTATGAGTAGTTCGGAGAGAGTACCG 

TGCGATTTCTGCGGCGAGCGTACGGCGGTTTTGTTTTGTAGAGCCGATACGGCGAAGCTG 

TGTTTGCCTTGTGATCAGCAAGTTCACACGGCGAATCTGTTGTCGAGGAAGCACGTGCGA 

TCTCAGATCTGCGATAATTGCGGTAACGAGCCAGTCTCTGTTCGGTGTTTCACCGATAAT 

CTGATTTTGTGTCAGGAGTGTGATTGGGATGTTCACGGAAGTTGTTCAGTTTCCGATGCT 

CATGTTCGATCCGCCGXGGAAGGTTTTTCCGGTTGTCCATCGGCGTTGGAGCTTGCTGCT 

TTATGGGGACTTGATTTGGAGCAAGGGAGGAAAGATGAAGAGAATCAAGTTCCGATGATG 

GCGATGATGATGGATAATTTCGGGATGCAGTTGGATTCTTGGGTTTTGK3GATCTAATGAA 

TTGATTGTTCCCAGCGATACGACGTTTAAGAAGCGTGGATCTTGTGGATCTAGTTGTGGG 

AGGTATAAGCAGGTATTGTGTAAGCAGCTTGAGGAGTTGCTTAAGAGTGGTGTTGTCGGT 

GGTGATGGCGATGATGGTGATCGTGACCGTGATTGTGACCGTGAGGGTGCTTGTGATGGA 

GATGGAGATGGAGAAGCAGGAGAGGGGCTTATGGTTCCGGAGATGTCAGAGAGATTGAAA 

TGGTCAAGAGATGTTGAGGAGATCAATGGTGGCGGAGGAGGAGGAGTTAACCAGCAGTGG 

AATGCTACTACTACTAATCCTAGTGGTGGCCAGAGTTCTCAGATATGGGATTTTAACTTG 

GGACAGTCACGGGGACCTGAGGATACGAGTCGAGTGGAAGCTGCATATGTAGGGAAAGGT 

GCTGCTTCTTCATTCACAATCAACAATTTTGTTGACCATATGAATGAAACTTGTTCCACT 

AATGTGAAAGGTGTC^U^GAGATTAAAAAGGATGACTACAAGCGATCAACTTCAGGCCAG 

GTACAACCAACAAAATCTGAGAGCAACAATCGTCCAATTACCTTTGGCTCTGAGA 

TCGAACTCCTCCAGTGACTTGCATTTCACAGAGCATATTGCTGGAACTAGTTGTAAGACC 

ACAAGACTAGTTGCAACTAAGGCTGATCTGGAGCGGCTGGCTCAGAACAGAGGAGATGCA 

ATGCAGCGTTACAAGGAAAAGAGGAAGACACGGAGATATGATAAGACCATAAGGTATGAA 

TCGAGGAAGGCAAGAGCTGACACTAGGTTGCGTGTCAGAGGCAGATTTGTGAAAGCTAGT 

GAAGCTCCTTACCCTTAACCTTAAGTTTTTTCACATAGGCTTCCTTTTAGCTACAAACTT 

AGTTACTTTTTTTACTCCACTGCCTCATAAATGTACAGACCGGTCTCGTTTCATCTGGCC 
GCCCTTCTTGTTTTATTGCCTTATCTGGCCCTTTTATGTACCTTGGAATCTTATCTAGTT
TAAAAAAGATTGTAACCTTCTAGAAAACCATATTCTGTTGACAGTATATACATGTCTATC 
CAAGCAAAAA 

>G1337 Amino Acid Sequence (domain in AA coordinates: 9-75) 
MSSSERVPCDFCGERTAVLFCRADTAKIiCLPCDQQVHTANLLSRKHVRSQICDNCGNEPV 
SVRCFTDNIilLCQECDWDVHGSCSVSDAHVRSAVEGFSGCPSALELAAIiWGLDLEQGRKD 
EENQVPMMAMMITONFGMQLDS 

LLiKSGVVGGDGDDGDRDRDCDREGACDGDGDGEAGEGLMVPEMSERIiKWSRDVEEINGGG 

GGGVNQQWNATTTNPSGGQSSQIWDFNIjGQSRGPEDTSRVEAAYVGKGAASSFTINNFVD 

HMNETCSTNVKGVKEIKKDDYKRSTSGQVQPTKSESNNRPITFGSEKGSNS^ 

IAGTS CKTTRIiVATKADLERIiAQNRGDAMQRY KE KRKTRR YDKT I RYE SRKARADTRLRV 

RGRFVKAS EAP YP * 

>G1759 (110..700)

CGAGAAAAGGAAAAAAAAAAATAGAAAGAGAAAACGCTTAGTATCTCCGGCX5ACTTGAAC 
CCAAACCTGAGGATCAAATTAGGGCACAAAGCCCTCTCGGAGAGAAGCCATGGGAAGAAA 
AAAACTAGAAATCAAGCGAATTGAGAACAAAAGTAGCCGACAAGTCACCTTCTCC^AACG 
TCGCAACGGTCTCATCGAGAAAGCTCGTCAGCTTTCTGTTCTCTGTGACGCATCCGTCGC 
TCTTCTCGTCGTCTCCGCCTCCGGCAAGCTCTACAGCTTCTCCTCCGGCGATAACCTGGT 
CAAGATCCTTGATCGATATGGGAAACAGCATGCTGATGATCTTAAAGCCTTGGATCATCA 
GTCAAAAGCTCTGAACTATGGTTCACACTATGAGCTACTTGAACTTGTGGATAGCAAGCT 
TGTGGGATCAAATGTCAAAAATGTGAGTATCGATGCTCTTGTTCAACTGGAGGAACACCT 
TGAGACTGCCCTCTCCGTGACTAGAGCCAAGAAGACCGAACTCATGTTGAAGCTTGTTGA 
GAATCTTAAAGAAAAGGAGAAAATGCTGAAAGAAGAGAACCAGGTTTTGGCTAGCCAGAT 
GGAGAATAATCATCATGTGGGAGCAGAAGCTGAGATGGAGATGTCACCTGCTGGACAAAT 
CTCCGACAATCTTCCGGTGACTCTCCCACTACTTAATTAGCCACCTTAAATCGGCGGTTG 
AAATCAAAATCCAAAACATATATAATTATGAAGAAAAAAAAAATAAGATATGTAATTAOT 
CCGCTGATAAGGGCGAGCGTTTGTATATCTTAATACTCTCTCTTTGGCCAAGAGACTTTG 
TGTGTGATACTTAAGTAGACGGAACTAAGTCAATACTATCTGTTTTAAGACAAAAGGTTG 
ATGAACTTTGTACCTTATTCGTGTGAGAAAAAAAAAAAAAAAA 

>G1759 Amino Acid Sequence (conserved domain in AA coordinates: 2-57) 

MGRKKLE I KRIENKS S RQVTFSKRRNGL IEKARQIiSVLCDAS VALLWSASGKLYSFS SG 

DNIj VICILDRYGKQHADDLKALDHQSKALNYGSHY^ IDALVQLi 

EEHLETAIiSWRAKKTELMLKLVENLKEKEKMLKEENQVI^ 

AGQISDNLPVTLPLLN* 

>G1804 (169..1497)

TATCTCTCTCTTTCTCAAAACCTTTC^GTC^AAATTCTCCGGCGGCTTTTAAACTATGTG 
AAGGAGGAGAACCTCC^TAACAAGAAGCGGATTCTCTCAGTTTTCCGGCGGCGGAGGAAC 
ACAAAGCCACCGGTTTTTAGACACACAGATTTCATTTTCAGTTGTTAAATGGTAACTAGA 
GAAACGAAGTTGACGTCAGAGCGAGAAGTAGAGTCGTCCATGGCGCAAGCGAGACATAAT 
GGAGGAGGTGGTGGTGAGAATCATCCGTTTACTTCTTTGGGAAGACAATCCTCTATCTAC 
TCATTGACCCTTGACGAGTTCCAACATGCTTTATGTGAGAACGGCAAGAACTTTGGGTCC 
ATGAACATGGACGAGTTTCTTGTCTCTATTTGGAACGCAGAGGAGAATAATAACAATCAA 
CAACAAGCAGCAGCAGCTGCAGGTTCACATTCTGTO 

AACAACAATAACAATGGAGGCGAGGGTGGTGTTGGTGTCTTTAGTGGTGGTTCTAGAGGC 

AACGAAGATGCTAACAATAAGAGAGGGATAGCGAACGAGTCTAGTCTTCCTCGACAAGGC 

TCTTTGACACTTCCAGCTCCGCTTTGTAGGAAGACTGTTGATGAGGTTTGGTCTGAGATA 

CATAGAGGTGGTGGTAGCGGTAATGGAGGAGACAGCAATGGACGTAGTAGTAGTAGTAAT 

GGACAGAACAATGCTCAGAACGGCGGTGAGACTGCGGCTAGAC^^CCGACTTTTGGAGAG 

ATGACACTTGAGGATTTCTTGGTGAAGGCTGGTGTGGTTAGAGAACATCCCACTAATCCT 

AAACCTAATCCAAACCCGAACCAAAACCAAAACCCGTCTAGTGTAATACCCGCAGCTGCA 

CAGCAAGAGCTTTATGGTGTGTTTCAAGGAACCGGTGATCCTTCATTCCCGGGTCAAGC 

ATGGGTGTGGGTGACCCATCAGGTTATGCTAAAAGGACAGGAGGAGGAGGGTATCAGCAG 

GCGCCACCAGTTCAGGCAGGTGTTTGCTATGGAGGTGGCGTTGGGTTTGGAGCGGGTGGA 

CAGCAAATGGGAATGGTTGGACCGTTAAGCCCGGTGTCTTCAGATGGATTAGGACATGGA 

CAAGTGGATAACATAGGAGGTCAGTATGGAGTAGATATGGGAGGGCTAAGGGGAAGGAAA 

AGAGTAGTGGATGGTCCAGTGGAGAAAGTAGTGGAGAGAAGACAGAGGAGGATGATCAAG 

AACCGCGAGTCTGCTGCTAGATCTAGAGCAAGAAAACAAGCATATACAGTGGAATTGGAA 
GCTGAACTTAACCAGTTGAAAGAAGAGAATGCGCAGCTAAAACATGCATTGGCGGAGTTG
GAGAGGAAGAGGAAGCAACAGTATTTTGAGAGTTTGAAGTCAAGGGCACAACCGAAATTG 
CCGAAATCGAACGGGAGATTGCGGACATTGATGAGGAACCCGAGTTGTCCACTCTAAACA 
AACAATAGGAAGATGGAGAAGAAGTCGGAGACAGAACGAGGGAAAAACTGATGATTTTCT 
ACGTTGTTGTTTTGTCTTTGAGGAATGAGGTTATAGAATCTTTATACTTTGATGTTTTCT 
GTGTTGGTAGGAGG2UVCACCATCTGATCTGCTTTACTAGTGTTCCCTGTGAACAAAGAAA 
GTGATTCTGTGTTTCAACATCATCAATCTTTGGAAA 

>G1804 Amino Acid Sequence (domain in AA coordinates: 357-407) 

MVTRETKLTSEREVESSMAQARHNGGGGGENHPFTSLGRQSSIYSLTIjDBFQHALCENGK 

NFGSMNlvnDEFIjVSIWNAEENlTblNQQQAAAAAGSHSVPA 

GSRGNEDANNKRGIANESSIiPRQGSIjTIiPAPLCRKTVDEVWSEIHRGGGSGNGGDSNGRS 

S S SNGQNNAQNGGETAARQPTFGE^^^LEDFLVKAGVVREHPTNPKPNPNPNQNQNPS S VI 

PAAAQQQIiYGVFQGTGDPSFPGQAMGVGDPSGYAKRTGGGGYQQAPPVQAGVCYGGGVGF 

GAGGQQMGMVGPLSPVSSIXJLGHGQVDNIGGQYGVDMGGLRGRKRVVDGPVEKVVEIiRQR 

RMIKNRESAARSRARKQAYTVELEAELNQLKEENAQLI^^ 

QPKLPKSWGRLRTLMRNPSCPIi* 

>G207 (16..930)

aaaagatctgtttcaatggcggatcgtgttaaaggtccatggagtcaagaagaagatgag 
cagctacgaaggatggttgagaaatacggaccgaggaattggtctgcgattagcaaatcg 
attccaggtcgatctggtaaatcgtgtagattacgttggtgtaatcagttatctccggag 
gttgagcatcgtcctttctcgccggaggaagatgagactattgtaaccgcccgtgctcag 
tttggtaacaagtgggcgacgattgctcgtcttcttaacggtcgtacggataacgccgtt 
aaaaatcactggaactctacgcttaagaggaaatgcagcggaggtgtggcggttacgacg 
gtgacggagacggaggaagatcaggatcggccgaagaagaggagatctgttagctttgat 
cctgcttttgctccggtggatactggattgtacatgagtcctgagagtcctaacggaatc 
gatgttagtgattctagcacgattccgtcaccgtcgtctcctgttgctcagctgtttaaa 
ccaatgccgatttccggcggttttacggtggttccgcagccgttaccggttgaaatgtct
tcgtcttcggaggatccacctacttcgttgagtttgtcactacctggagctgagaacacg 
agttcgagccataacaataacaacaacgcgttgatgtttccgagatttgagagtcagatg 
aagattaatgtagaggagagaggaggaggaggagaaggacgtagaggtgagtttatgacg 
gtggtgcaggagatgataaaagctgaagtgaggagttacatggcggaaatgcagaaaaca 
agtggtggattcgtcgtcggaggtttatacgaatccggcggcaatggtggttttagggat 
tgtggagtaataacacctaaggttgagtagttttggtttagggttaaaacttgaatcgat 
tggggattttcaagagcattcatttttggggtttatggtaaaattaaaaacaaaaacaaa 
atgtacagaggaattaaaatttctatggaataatcttaaatctcaaatatttgttacttg 
ttttggtgattcataaccaaaatcaaa 

>G207 Amino Acid Sequence (domain in AA coordinates: 6-106) 

MADRVKGPWSQEEDEQLRRMVEKYGPRNWSAISKSIPGRSGKSCRLRWCNQLSPEVEHRP 

FSPEEDETIVTARAQFGNKWATIARLLNGRTDNAVKNHWNSTLKRKCSGGVAVTTVTETE

EDQDRPKKRRSVSFDPAFAPVDTGLYMSPESPNGIDVSDSSTIPSPSSPVAQLFKPMPIS

GGFTVVPQPLPVEMSSSSEDPPTSLSLSLPGAENTSSSHNNNNNALMFPRFESQMKINVE

ERGGGGEGRRGEFMTVVQEMIKAEVRSYMAEMQKTSGGFVVGGLYESGGNGGFRDCGVIT

PKVE* 

>G218 (1..1182) 

ATGGAGGCAGAGATCGTGAGACGATCGGAGGTAACGGGATTAAGAAGGGAGGTGGAAGAA 
TCGTCAATTGGTAGAGGAGATTGCGATGGTGATGGCGGCGATGTGGGAGAAGATGCGGCA 
GGGTTCGTTGGGACGAGCGGGAGAGGAAGAAGAGATCGAGTTAAAGGGCCGTGGTCGAAG 
GAGGAGGATGATGTGTTGAGTGAGCTCGTTAAGAGGTTGGGAGCGAGGAATTGGAGTTTT 
ATCGCTCGGAGTATTCCTGGTCGTTCAGGCAAGTCTTGTCGTCTTCGTTGGTGTAATCAG 
CTCAATCCAAATCTTATACGCAATTCATTTACTGAGGTAGAGGATCAGGCTATCATCG^ 
GCACATGCCATCCACGGAAACAAATGGGCTGTTATCGCGAAGCTCCTCCCCGGAAGAACA 
GATAATGCTATCAAGAACCACTGGAACTCTGCTTTAAGACGTCGATTCATAGACTTTGAA 
AAGGCCAAGAATATAGGAACTGGAAGCTTGGTCGTGGATGATTCTGGATTTGACAGAACG 
ACAACAGTAGCCTCATCAGAAGAAACTTTATCTTCAGGCGGTGGTTGCCATGTAACTACT 
CCAATTGTATCTCCAGAAGGCAAAGAAGCTACCACCTCCATGGAAATGTCTGAAGAACAA 
TGCGTAGAGAAAACAAACGGAGAAGGTATTTCTAGGCAAGATGATAAGGATCCTCCAACG 
CTTTTCCGCCCAGTGCCTCGGCTCAGTTCTTTTAATGCTTGCAATCACATGGAAGGATCA 
CCCTCTCCACATATACAAGACCAAAATCAGCTCCAATCATCTAAACAAGACGCAGCAATG

CTAAGATTGCTTGAAGGAGCTTACAGCGAACGGTTTGTGCCTCAAACATGTGGAGGTGGT 

TGTTGCAGCAACAATCCCGATGGCAGTTTTCAGCAAGAATCATTGTTGGGTCCAGAGTTT 

GTGGATTACTTAGACTCACCAACGTTTCCGAGTTCCGAACTAGCTGCTATAGCAACGGAA 

ATAGGCAGCCTCGCTTGGCTGAGAAGCGGTTTAGAGAGTAGCAGCGTGAGGGTGATGGAA 

GACGCAGTTGGTCGGTTAAGGCCTCAAGGCTCCAGGGGTCATCGAGATCATTATCTTGTA 

TCTGAACAGGGGACGAACATAACCAATGTCCTGTCCACATAA 

>G218 Amino Acid Sequence (domain in AA coordinates: TBD) 

MEAEIVRRSEVTGLRREVEESSIGRGDCDGDGGDVGEDAAGFVGTSGRGRRDRVKGPWSK

EEDDVLSELVKRLGARNWSFIARSIPGRSGKSCRLRWCN^^

AHAIHGNKWAVIAKLLPGRTDNAIKNHWNSALRRRFIDFEKAKNIGTGSLVVDDSGFDRT
TTVASSEETLSSGGGCHVTTPIVSPEGKEATTSMEMSEEQC^

LFRPVPRLSSFNACNHMEGSPSPHIQDQNQLQSSKQDAAMLRLLEGAYSERFVPQTCGGG

CCSNNPDGSFQQESLLGPEFVDYLDSPTFPSSELAAIATEIGSLAWLRSGLESSSVRVME

DAVGRLRPQGSRGHRDHYLVSEQGTNITNVLST* 

>G241 (46..867)

GAAAAA(^TTTGAACTTCTTTTAT 

TGCTGTGAGAAGATGGGGTTGAAGAGAGGACCATGGACACCTGAAGAAGATCAAATCTTG 
GTCTCTTTTATCCTCAACCATGGACATAGTAACTGGCGAGCCCTCCCTAAGCAAGCTGGT 
CTTTTGAGATGTGGAAAAAGCTGTAGACTTAGGTGGATGAACTATTTAAAGCCTGATATT 
AAACGTGGCAATTTCACCAAAGAAGAGGAAGATGCTATCATCAGCT^ 

GGCAATAGATGGTCAGCGATTGCAGCAAAACTGCCTGGAAGAACCGATAACGAGATCAAG 
AACGTATGGCACACTCACTTGAAGAAGAGACTCGAAGATTATCAACCAGCTAAACCTAAG 
ACCAGCAACAAAAAGAAGGGTACTAAACCAAAATCTGAATCCGTAATAACGAGCTCGAAC 
AGTACTAGAAGCGAATCGGAGCTAGCAGATTCATCAAACCCTTCTGGAGAAAGCTTATTT 
TCGACATCGCCTTCGACAAGTGAGGTTTCTTCGATGACACTCATAAGCCACGACGGCTAT 
AGCAACGAGATTAATATGGATAACAAACCGGGAGATATCAGTACTATCGATCAAGAATGT 
GTTTCTTTCGAAACTTTTGGTGCGGATATCGATGAAAGCTTCTGGAAAGAGACACTGTAT 
AGCCAAGATGAACACAACTACGTATCGAATGACCTAGAAGTCGCTGGTTTAGTTGAGATA 
CAAC^GAGTTTCAAAACTTGGGCTCCGCTAATAATGAGATGATTTTTGACAGTGAGATG 
GAACTTCTGGTTCGATGTATTGGCTAGAACCGGCGGGGAACAAGATCTCTTAGCCGGGCT 
CTAGTTAACATGTTTGAGGAGTAAAGTGAAATGGTGCAAATTAGTTAAGGCTAAGAAATT 
CAAAAGCTTTTGTTTACCGAGAAAAAAACACACTCTAACTCTTGATGTGATGTAGTTAGT 
GTATTAATTAGAGGCTGCGTTTTCAA 

>G241 Amino Acid Sequence (domain in AA coordinates: 14-114) 
MGRAPCCEKMGLK3*GPWTPEEDQILVSFILim^ 

LKPDIKRGNFTKEEEDAIISLHQILGNRWSAIAAKLPGRTDNEIKNVWHTHLKKRLEDYQ

PAKPKTSNKKKGTKPKSESVITSSNSTRSESELADSSNPSGESLFSTSPSTSEVSSMTLI

SHDGYSNEINMDNKPGDISTIDQECVBFETFGADIDESFWK^ 

GLVEIQQEFQNLGSAKNEMIFDSEMELLVRCIG*

>G254 (15..923)

CGATTTCGAGCTCTATGGTGTCCGTAAACCCTAGACCTAAGGGTTTTCCAGTTTTCGATT 
CCTCGAATATGAGTTTACCAAGCTCCGATGGATTTGGTTCGATTCCGGCCACGGGACGGA 
CCAGTACGGTGTCGTTTTCTGAGGATCCGACGACGAAGATTCGGAAGCCGTACACAATCA 
AGAAGTCGAGAGAGAATTGGACAGATCAAGAGCACGATAAATTTCTAGAAGCTCTTCACT 
TATTCGATAGGGATTGGAAGAAAATAGAAGCCTTTGTTGGATCAAAAACAGTAGTTCAGA 
TACGAAGCCACGCTCAGAAATACTTTCTCAAAGTTCAGAAGAGTGGTGCTAACGAACATC 
TTCCACTTCCTCGACCTAAGAGGAAAGCGAGTCATCCTTATCCTATAAAGGCTCCTAAAA 
ATGTTGCTTATACCTCTCTCCCGTCTTCGAGTACATTACCGTTGCTTGAGCCTGGTTATT 
TGTATAGCTCTGATTCGAAGTCATTGATGGGAAACCAGGCTGTTTGTGCATCTACCTCTT 
CTTCGTGGAATCATGAATCGACAAATCTGCCAAAACCGGTGATTGAAGAGGAACCGGGAG 
TCTCGGCCACGGCTCCTCTCCCAAATAATCGCTGCAGACAGGAAGATACAGAGAGGGTAC 
GAGCAGTGACAAAGCCAAATAACGAAGAAAGTTGTGAAAAGCCACATAGAGTGATGCCGA 
ATTTTGCTGAAGTTTACAGCTTCATTGGAAGTGTCT^ 

TCCAGAGATTA7^GCAGATGGATCCAATAAATATGGAAACGGTTCTTTTACTGATGCAAA 
ACCTGTCTGTAAATCTGACAAGTCCCGAGTTTGCAGAGCAAAGGAGGTTGATATCATCAT 
ACAGCGCTAAAGCTTTGAAATAGAGATAGAATAAAACAATAATGTACCTTATGTGAGATC 
AAGAGACAATCATCCAAGGTCTGTATGCATTGCTTGGATTTAGGCCTCGTGTTCTCACTA

CAGGAGCAGAACCAATCGCAAAGACTCTTAGATGGCTACTGAGTTGTGGTTTTTATGTCT 

CTGTAAGTCGCGGTGGAGCACACGTGTTTGTCCTGTCTTGTGTATGTGTGTATAGATAAT 

ACAAGGTTTTGCAGAGTAAGGTCACAGTTAGCTGCAAGTGAGTTTGGATCAATCTTAAGA 

TTAAAACCCTGAGAGTGAGTGTCCAAAGAGACTGTGTAATATTGGTTTGGCGGTCAGCAG 

AAGAGTTTTGAAGTGCACATCCAGTTAGTGATAACACGGTTGAAGAAAAGGTAAGGTTAC 

AAGT^TAGTTTTGAATAATTGTATACTCAAAAAATATGAATGTATAAAGAATAATCACTT 

GAGTCGCCTTA 

>G254 Amino Acid Sequence (domain in AA coordinates: 62-106) 

MVSVNPRPKGFPVFDSSNMSLPSSDGFGSIPATGRTSTVSFSEDPTTKIRKPYTIKKSRE 

NWTDQEHDKFLEALHLFDRDWKKIEAFVGSKTVVQIRSHAQKYFLKVQKSGANEHLPLPR

PKRKASHPYPIKAPKNVAYTSLPSSSTLPLLEPGYLYSSDSKSLMGNQAVCASTSSSWNH

ESTNLPKPVIEEEPGVSATAPLPNNRCRQEDTERVRAVTKPNNEESCEKPHRVMPNFAEV

YSFIGSVFDPNTSGHLQRLKQMDPINMET 

LK* 

>G26 (73..729)

TTGGCTTGTACCCAAACCCATCTTTGACTTCAAAAATAAAATAAAAATAATCATAATTGA 
CATCATCGGATAATGCATAGCGGGAAGAGACCTCTATCACCAGAATCAATGGCCGGAAAT 
AGAGAAGAGAAAAAAGAGTTGTGTTGTTGCTCAACTTTGTCGGAATCTGATGTGTCTGAT 
TTTGTCTCTGAACTCACTGGTGAACCCATCCCATC^TCCATTGATGATCAATCTTCGTCG 
CTTACTCTTCAAGAAAAAAGTAACTCGAGGCAACGAAACTACAGAGGCGTGAGGCAAAGA
CCGTGGGGAAAATGGGCGGCTGAGATTCGTGACCCGAACAAGGCAGCTCGTGTGTGGCTT 
GGGACGTTCGACACTGCAGAAGAAGCCGCCTTAGCGTATGATAAAGCTGCATTTGAGTTT 
AGAGGTCACAAGGCCAAGCTTAACTTCCdCGAGCATATTCGTGTGAACCCTACTCAACTC 
TATCCATCGCCCGCTACTTCCGATGATCGCATTATCGTGACACCACCTAGTCCACCTCCA 
CCAATTGCTCCTGACATACTTCTTGATCAATATGGCCACTTTGAATCTCGAAGTAGTGAT 
TCCAGTGCCAACTTGTCC^TGAATATGCTGTCTTCTTCGTCTTC^TCTTTGAATC^TC^ 
GGGCTAAGACCAAATTTGGAGGATGGTGAAAACGTGAAGAACATTAGTATCCACAAACGA 
CGAAAATAACATGTTAATGGCATAAATATCTCTTCGTCCAAGTTATCAAACGCATTGACC 
TCCGGCTTTGATCATTTTAGGCGCTTAATCTCTTTACGACTTCATTTTGGTAGTCTTTAA 
AGAGTCTATGGAGTGGATTTAGCTAGGAATCAGGCCTTATGGATGAAAAATATATAAATT 
TTGAACATGACTATGCAAGAATGGGATGAAGACTACTTAGCTTGGAAAACGTCCTGATAG 
GTCATGACGACTATATCCACAGAAGATGACCGACGGAGACAACAACATGCCTCACCTGAT 
CGACCGATCAAATGAGATAATGTGTTGACCGGACCGGTCGGATCAGGTTGGGTCGAGTAT 

ATCA 

>G26 Amino Acid Sequence (domain in AA coordinates: 67-134) 
MHSGKRPLSPESMAGNREEKKELCCCSTLSESDVSDFVSELTGQPIPSSIDDQSSSLTLQ

EKSNSRQRimtGWQRPWGKWAAEIRDPNKAARVWI^ 

AKLNFPEHIRVNPTQDYPSPATSHDRIIVTPPSPPPPIAPDILLDQYGHFQSRSSDSSAN 
LSMNMLSSSSSSLNHQGLRPNLEDGENVKNISIHKRRK*
>G263 (48..902)

TTTTTAGTTTTATTTTTCTGTGGTAAAATAAAAAAAGTTCGCCGGAGATGACGGCTGTGA 

CGGCGGCG(^AAGATC^GTTCCGGCGCCGTTTTTAAGCAAAACGTATCAGCTAGTTGATG 

ATGATAGCACAGACGACGTCGTTTCATGGAACGAAGAAGGAACAGCTTTTGTCGTGTGGA 

AAAGAGCAGAGTTTGC^AAAGATCTTCTTCCTCAATACTTCAAGCATAATAATTTCTCAA 

GCTTCATTCGTCAGCTCAACACTTACGGATTTCGTAAAACTGTACCGGATAAATGGGA^ 

TTGCAAACGATTATTTCCGGAGAGGCGGGGAGGATCTGTTGACGGACATACGACGGCGTA 

AATCGGTGATTGCTTCAACGGCGGGGAAATGTGTTGTTGTTGGTTCGCCTTCTGAGTCTA 

ATTCTGGTGGTGGTGATGATCACGGTTCAAGCTCCACGTCATCACCCGGTTCGTCGAAGA 

ATCCTGGTTCGGTGGAGAACATGGTTGCTGATTTATCAGGAGAGAACGAGAAGCTTAAAC 

GTGAAAACAATAACTTGAGCTCGGAGCTCGCGGCGGCGAAGAAGCAGCGCGATGAGCTAG 

TGACGTTCTTGACGGGTCATCTGAAAGTAAGACCGGAACAAATCGATAAAATGATCAAAG 

GAGGGAAATTTAAACCGGTGGAGTCTGACGAAGAGAGTGAGTGCGAAGGTTGCGACGGCG 

GCGGAGGAGC^GAGGAGGGGGTAGGTGAAGGATTGAAATTGTTTGGGGTGTGGTTGAAAG 

GAGAGAGAAAAAAGAGGGACCGGGATGAAAAGAATTATGTGGTGAGTGGGTCCCGTATGA 

CGGAAATAAAGAACGTGGACTTTCACGCGCCGTTGTGGAAAAGCAGCAAAGTCTGCAACT 

AAAAAAAGAGTAGAAGACTGTTCAAACCAGCGTGTGACACGTCATCGACGACGACGAAAA 
AAATGATTTAAAAAACTATTTTTTTCCGTAAGGAAGAAAAGTTATTTTTATGTTTTAAAA
AGGTGAAGAAGGTCCAGAAGGATCAACGCAAATATATAAATGGATTTTCATGTATTATAT 
AATTTAATTAGTGTATTAAGAAAA 

>G263 Amino Acid Sequence (domain in AA coordinates: TBD) 
MTAVTAAQRS VP AP FLS KT YQLVDDHS TDD WS WNEEGT AF WWKTAE FAKDLLPQ YF KH 
NNFSSFIRQblSrrYGFRKTVPDKWEFANDYFRRGGEDLLTDIRRRKSVIASTA^ 
PSESNSGGGDDHG S S STS S PGS S KNPG SVEN^fVADLSGENEKIJKRENNNIJSSEIlAAAKKQ 
RDEIiVTFLTGHLKVRPEQIDKMIKGGKJFKPVESDEESECEGCDGGGGAEEGVGEGIiKLFG 
VWLKGERKKRDRDE KNYWSGSRMTE I KNVDFHAPLWKS S KVCN* 
>G308 (196..1794)
AGTAATTTAGTTTTTTTTTTTTTTOT 

AGTGAAAAAACAAATCCTAAGCAGTCCTAACCGATCCCCGAAGCTAAAGATTCTTCACCT 

TCCCAAATAAAGCAAAACCTAGATCCGACATTGAAGGAAAAACCTTTTAGATCCATCTCT 

GAAAAAAACCCAACC^TGAAGAGAGATCATO^TCATCAT.CATCAAGATAAGAAGACTATG 

ATGATGAATGAAGAAGACGACGGTAACGGCATGGATGAGCTTCTAGCTGTTCTTGGTTAC 

AAGGTTAGGTCATCGGAAATGGCTGATGTTGCTCAGAAACTCGAGCAGCTTGAAGTTATG 

ATGTCTAATGTTCAAGAAGACGATCTTTCTCAACTCGCTACTGAGACTGTTCACTATAAT 

CCGGCGGAGCTTTACACGTGGCTTGATTCTATGCTCACCGACCTTAATCCTCCGTCGTCT 

AACGCCGAGTACGATCTTAAAGCTATTCCCGGTGACGCGATTCTCAATCAGTTCGCTATC 

GATTCGGCTTCTTCGTCTAACCAAGGCGGCGGAGGAGATACGTATACTACAAACAAGCGG 

TTGAAATGCTCAAACGGCGTCGTGGAAACCACCACAGCGACGGCTGAGTCAACTCGGCAT 

GTTGTCCTGGTTGACTCGCAGGAGAACGGTGTGCGTCTCGTTCACGCGCTTTTGGCTTGC 

GCTGAAGCTGTTCAGAAGGAGAATCTGACTGTGGCGGAAGCTCTGGTGAAGCAAATCGGA 

TTCTTAGCTGTTTCTCAAATCGGAGCTATGAGACAAGTCGCTACTTACTTCGCCGAAGCT 

CTCGCGCGGCGGATTTACCGTCTCTCTCCGTCGCAGAGTCCAATCGACCACTCTCTCTCC 

GATACTCTTCAGATGCACTTCTACGAGACTTGTCCTTATCTCAAGTTCGCTCACTTCACG 

GCGAATCAAGCGATTCTCGAAGCTTTTCAAGGGAAGAAAAGAGTTCATGTCATTGATTTC 

TCTATGAGTCAAGGTCTTCAATGGCCGGCGCTTATGCAGGCTCTTGCGCTTCGACCTGGT 

GGTCCTCCTGTTTTCCGGTTAACCGGAATTGGTCCACCGGCACCGGATAATTTCGATTAT 

CTTCATGAAGTTGGGTGTAAGCTGGCTCATTTAGCTGAGGCGATTCACGTTGAGTTTGAG 

TACAGAGGATTTGTGGCTAAC^CTTTAGCTGATCTTGATGCTTCGATGCTTGAGCTTAGA 

CCAAGTGAGATTGAATCTGTTGCGGTTAACTCTGTTTTCGAGCTTCAGAAGCTCTTGGGA 

CGACCTGGTGCGATCGATAAGGTTCTTGGTGTGGTGAATCAGATTAAACCGGAGATTTTC 

ACTGTGGTTGAGCAGGAATCGAACCATAATAGTCCGATTTTCTTAGATCGGTTTACTGAG 

TCGTTGCATTATTACTCGACGTTGTTTGACTCGTTGGAAGGTGTACCGAGTGGTCAAGAC 

AAGGTCATGTCGGAGGTTTACTTGGGTAAACAGATCTGCAACGTTGTGGCTTGTGATGGA 

CCTGACCGAGTTGAGCGTCATGAAACGTTGAGTCAGTGGAGGAACCGGTTCGGGTCTGCT 

GGGTTTGCGGCTGCACATATTGGTTCGAATGCGTTTAAGCAAGCGAGTATGCTTTTGGCT 

CTGTTCAACGGCGGTGAGGGTTATCGGGTGGAGGAGAGTGACGGCTGTCTCATGTTGGGT 

TGGCACACACGACCGCTCATAGCCACCTCGGCTTGGAAACTCTCCACCAATTAGATGGTG 

GCTCAATGAATTGATCTGTTGAACCGGTTATGATGATAGATTTCCGACCGAAGCCAAACT 

AAATCCTACTGTTTTTCCCTTTGTCACTTGTTAAGATCTTATCTTTCATTATATTAGGTA 

ATTGAAAAATTTTAATCTCGCCTAAATTACT 

>G308 Amino Acid Sequence (domain in AA coordinates: 270-274) 
MKRDHHHHHQDKKTMMMNEEDDGNGMD 

EDDLSQLATETVHYNPAELYTWLDSMLTDLNPPSSNAEYDLKAIPGDAILNQFAIDSASS
SNQGGGGDTYTTNKRLKCSNGVVETTTATAESTRHVVLVDSQENGVRLVHALLACAEAVQ
KENLWAEALVKQXGFIiAVSQIGAMRQVATYFAEALARRI 

HFYETCPYLKFAHFTANQAILEAFQGKKRVHVIDFSMSQGLQWPALMQALALRPGGPPVF
RLTGIGPPAPDNFDYIjHEVGCKLAHLAKAIHVEFEYRGF 

SVAWSVFELHIOiIiGRPGAIDKVIiGVWQIKPEIFTVVEQESNHNSPIFLDRFTESL 
STLFDSLEGVPSGQDKVMSEVYLGKQICNVVACDGPDRVERHETLSQWRN^ 
HIGSNAFKQASMLLALFNGGEGYRVEESDGCLMLGWHTRPLIATSAWKLSTN*
>G38 (149..1156)

GAGGAAAACTCGAAAAAGCTACACACAAGAAGAAGAAGAAAAGATACGAGCAAGAAGACT 
AAACACGAAAGCGATTTATC^CTCGAAGGAAGAGACTTTGATTT^ 

TATAGATTGTGTTGTTTCTGGGAAGGAGATGGCAGTTTATGATCAGAGTGGAGATAGAAA 
CAGAACACAAATTGATACATCGAGGAAAAGGAAATCTAGAAGTAGAGGTGACGGTACTAC

TGTGGCTGAGAGATTAAAGAGATGGAAAGAGTATAACGAGACCGTAGAAGAAGTTTCTAC 

CAAGAAGAGGAAAGTACCTGCGAAAGGGTCGAAGAAGGGTTGTATGAAAGGTAAAGGAGG 

ACCAGAGAATAGCCGATGTAGTTTCAGAGGAGTTAGGCAAAGGATTTGGGGTAAATGGGT 

TGCTGAGATCAGAGAGCCTAATCGAGGTAGCAGGCTTTGGCTTGGTACTTTCCCTACTGC 

TCAAGAAGCTGCTTCTGCTTATGATGAGGCTGCTAAAGCTATGTATGGTCCTTTGGCTCG 

TCTTAATTTCCCTCGGTCTGATGCGTCTGAGGTTACGAGTACCTCAAGTCAGTCTGAGGT 

GTGTACTGTTGAGACTCCTGGTTGTGTTCATGTGAAAACAGAGGATCCAGATTGTGAATC 

TAAACCCTTCTCCGGTGGAGTGGAGCCGATGTATTGTCTGGAGAATGGTGCGGAAGAGAT 

GAAGAGAGGTGTTAAAGCGGATAAGCATTGGCTGAGCGAGTTTGAACATAACTATTGGAG 

TGATATTCTGAAAGAGAAAGAGAAACAGAAGGAGCAAGGGATTGTAGAAACCTGTCAGCA 

ACAACAGCAGGATTCGCTATCTGTTGCAGACTATGGTTGGCCCAATGATGTGGATCAGAG 

TCACTTGGATTCTTCAGACATGTTTGATGTCGATGAGCTTCTACGTGACCTAAATGGCGA 

CGATGTGTTTGCAGGCTTAAATCAGGACCGGTACCCGGGGAACAGTGTTGCCAACGGTTC 

ATACAGGCCCGAGAGTCAACAAAGTGGTTTTGATCCGCTACAAAGCCTCAACTACGGAAT 

ACCTCCGTTTCAGCTCGAGGGAAAGGATGGTAATGGATTCTTCGACGACTTGAGTTACTT 

GGATCTGGAGAACTAAACAAAACAATATGAAGCTTTTTGGATTTGATATTTGCCTTAATC 

CCACAACGACTGTTGATTCTCTATCCGAGTTTTAGTGATATAGAGAACTACAGAACACGT 

TTTTTCTTGTTATAAAGGTGAACTGTATATATCGAAACAGTGATATGACAATAGAGAAGA 

CAACTATAGTTTGTTAGTCTGCTTCTCTTAAGTTGTTCTTTAGATATGTTTTATGTTTTG 

TAACAACAGGAATGAATAATACACACTTGTGAAGCTTTTAAAAAAAAAAAAAAAAAAAAA 

>G38 Amino Acid Sequence (domain in AA coordinates: 76-143) 

MAVYDQSGDRNRTQIDTSRKRKSRSRGDGTTVAERL 

SKKGCMKGKGGPENSRCSFRGVRQRIWGKWVAEIREPNRGSRLWLGTFPTAQEAASAYDE

AAKA^GPLARLNFPRSDASEVTSTSSQSEVCW 

MYCLENGAEEMKRGVKADKHT^SEFEHNYWSDILKEKEKQKEQGI^ 

DYGWPNDVDQSHLDSSDMFDVDELLRDLNGDDVFAGLNQDRYPGNSVANGSYRPESQQSG

FDPLQSLNYGIPPFQLEGKDGNGFFDDLSYLDLEN*

>G43 (38..643)

CTCCTGTCTTGTCTAAAGAAAAAAGAGAGAGGAAGAAATGGAGACTTTTGAGGAAAGCTC 
TGATTTGGATGTTATACAGAAACATCTATTTGAAGACTTGATGATCCCTGATGGTTTCAT 
TGAAGATTTTGTCTTTGATGATACTGCTTTTGTCTCCGGACTOTGGTCTCTAGAACCCTT 
TAACCCAGTTCCGAAACTGGAACCTAGTTCACCTGTTCTTGATCCAGATTCCTATGTCCA 
AGAGATTCTGCAAATGGAAGCAGAATCATCATCATCATCATC^CAACAACGTCACCTGA 
GGTTGAGACTGTCTCAAACCGGAAAAAAACAAAGAGGTTTGAAGAAACGAGACATTACAG 
AGGCGTGAGAAGGAGGCCATGGGGGAAATTTGCAGCAGAGATTCGAGATCCGGCAAAGAA 
AGGATCCAGGATTTGGTTAGGCACTTTTGAGAGTGATATTGATGCTGCAAGGGCTTACGA 
CTATGCAGCTTTTAAGCTCAGGGGAAGAAAAGCTGTTCTCAACTTTCCTTTGGATGCCGG 
AAAGTATGATGCTCCGGTCAATTCATGCCGAAAAAGGAGGAGAACCGATGTACCACAGCC 
TCAAGGAACAACAACAAGTACTTCATCATCGTCATCAAACTAATGGGGGAATAGTGATGT 
TTAATTAGTATATATAGGTTAATATCTTAAGTATGTGAAGCATCATGTATAGAGCCAAGA 
ACCTGTTAGACTAGTGTACTGAAAAGAACTCTTGCAAAATATGTACTAAAGAGTTCCTGT 
AACAATGGAACTTCTGCGTTTTCTCTTGTCTTAAAGAGCTTAAGGTTCTAGAAACAAAGT 
TCTTGTCCTTTCGGTTTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 
AAAAAAAAA

>G43 Amino Acid Sequence (domain in AA coordinates: 104-172) 
METFEESSDLDVIQKHLFEDLMIPDGFIEDFVFDDTAFVSGLWSLEPFNPVPKLEPSSPV 
LDPDSYVQEILQMEAESSSSSSTTTSPEVETVSNRKKTKRFEETRHYRGVRRRPWGKFAA
EIRDPAKKGSRIWLGTFESDIDAARAYDYAAFKLRGRKAVLNFPLDAGKYDAPVNSCRKR
RRTDVPQPQGTTTSTSSSSSN* 
>G536 (1..768) 

ATGTCGACAAGGGAAGAGAATGTTTACATGGCGAAATTAGCCGAACAAGCTGAACGTTAC 
GAAGAAATGGTTGAATTCATGGAGAAAGTTGCGAAAACTGTTGATGTTGAGGAACTTTCA 
GTTGAAGAGAGGAATCTTCTCTCTGTTGCTTACAAGAACGTGATTGGAGCGAGAAGAGCT 
TCGTGGAGAATCATTTCTTCGATTGAGCAGAAAGAAGAGAGCAAAGGGAACGAAGATCAT 
GTTGCTATTATCAAGGATTACAGAGGAGAGATTGAATCCGAGCTTAGCAAAATCTGTGAT 
GGGATTTTGAATGTTCTTGAAGCTCATCTTATTCCTTCTGCTTCACCAGCTGAATCTAAA 
GTGTTTTATCTTAAGATGAAGGGTGATTATCATAGGTATCTTGCTGAGTTTAAGGCTGGT

GCTGAAAGGAAAGAAGCTGCTGAAAGCACTTTGGTTGCTTACAAGTCTGCTTCCGACATT 

GCCACTGCTGAGTTAGCTCCTACTCACCCGATAAGGCTTGGTCTTGCACTCAACTTCTCT 

GTGTTTTACTATGA7UVTCCTCAACTCGCCTGATCGTGCTTGCAGCCTCGCAAAGCAGGCG 

TTTGATGATGCAATCGCTGAGTTAGATACATTGGGTGAGGAATCATACAAGGACAGTACA 

CTGATTATGCAGCTTCTTAGAGACAATCTCACTCTCTGGACTTCAGATATGACTGACGAA 

GCAGGAGATGAGATTAAGGAGGCATCAAAGCCCGATGGTGCCGAGTAA 

>G536 Amino Acid Sequence (domain in AA coordinates: 226-233)

MSTREENVYMAKI^QAERYEEMVEFMEKVAKTTO 

SWRIISSIEQKEESKGNEDHVAIIKDYRGEIESELSKICDGILNVLEAHLIPSASPAESK

VFYLKMKGDYHRYLAEFKAGAERKEAAESTLVAYKSASDI^ 

VFYYEILNSPDRACSLAKQAFDDAIAELDTI^EESY^ 

AGDEIKEASKPDGAE* 

>G567 (38..1273)

AAAAAGAAGAATCAGAAAGTGAAAAAGAGAGCGAGCGATGAACAGTATCTTCTCCATTGA 
CGATTTCTCCGATCCTTTCTGGGAAACTCCTCCGATTCCTCTCAATCCCGACTCTTCTAA 
GCCTGTTACGGCGGATGAAGTTAGCCAGAGTCAACCGGAATGGACTTTCGAGATGTTTCT 
CGAAGAGATTT*CTTCGTCGGCGGTGAGCTCTGAGCCACTTGGTAACAACAACAACGCGAT 
CGTCGGTGTTTCTTCGGCGCAATCTCTTCCTTCTGTTTCCGGACAGAATGATTTCGAGGA 
TGATAGTCGATTTCGTGATCGCGATTCGGGAAATTTGGATTGTGCTGCTCCCATGACGAC 
GAAGACGGTGAATGTTGATTCCGATGATTATCGTCGTGTTCTTAAGAACAAGCTTGAGGC 
TGAGTGCGCGACTGGTGTTTCTCTTCGGGTTGGGTCTGTGAAGCCTGAAGATTCGACTAG 
TTCTCCAGAAACTCAACTTCAACCAGTTCAATCCAGTCCTCTTACTCAAGGAGAACTTGG 
TGTTACTTCTTCCTTACC^GCTGAGGTGAAAAAAACTGGTGTATGAATGAAGCAGGTTAC 
TAGTGGATCGTCGAGAGAATATTCTGATGACGAGGACCTTGATGAAGAGAATGAAACCAC 
CGGTTCCTTGAAGCCAGAGGACGTTAAAAAATCTAGAAGGATGCTGTCAAATCGTGAGTC 
AGCTAGGCGATCTAGAAGGAGAAAGCAGGAGCAAACAAGTGACCTCGAAACACAGGTTAA 
TGATCTAAAAGGTGAGCATTCATCACTTCTTAAACAACTGAGCAACATGAATCAC^VAGTA 
TGACGAGGCTGCTGTTGGCAATAGAATACTAAAGGCTGACATTGAGACATTAAGAGCTAA 
GGTGAAAATGGCGGAAGAAACCGTGAAGAGAGTAACAGGAATGAATCCGATGCTTCTCGG 
AAGA.TCAAGTGGACATAACAACAACAACAGAATGCCAA 

TTCTTCTAGCATTATTCCAGCTTATCAACCACACTCAAACCTAAACCATATGTCAAACCA 
AAACATCGGGATCCCAACCATTCTACCTCCAAGACTCGGAAACAATTTCGCTGCTCCTCC 
ATCCCAAA.CCAGCTCTCCCTTGCAGAGAATTAGAAATGGGC?^AATCACCATGTTACTCC 
AAGCGCCAACCCGTATGGCTGGAATACCGAACCTCAGAACGATTCAGCATGGCCGAAAAA 
ATGCGTGGACTGATCAAACAAGAAGCGGGTTTCGCACTATATTAATGTCTATGCATCTGT 
AATTTGTAAGTGTTATTAAGTTACGAATCATGAGAAAACATCTTGTGAAAATACAGTCTC 
ATGGCTTATATATATATATAAGCTCTGTCTTATAACATTACAAGATTCTTATTTGAGAAT 
CGTCTTTCTATTTATAGCTAATAAAAAAAAAAAAAAAAA 

>G567 Amino Acid Sequence (domain in AA coordinates: 210-270)

MNSIFSIDDFSDPFWETPPIPLNPDSSKPVTADEVSQSQPEWTFEMFLEEISSSAVSSEP 

LGNNNNAIVGVSSAQSLPSVSGQND^ 

VLKNKLEAECATGVSLRVGSVKPEDSTSSPETQLQPVQSSPLTQGELGVTSSLPAEVKKT
GVSMKQVTSGSSREYSDDEDLDEEl^TTGSLKPEDyKKSRRMIjSNRESARRSRRRKQEQT 
SDLETQVNDIiKGEHSSIiIjKQIjSNMNHKYDEAAVGNRILKADI 
GMNPI^LGRSSGHNNNIH^P 

GNNFAAPPSQTSSPLQRIRNGQNHHVTPSANPYGWNTEPQNDSAWPKKCVD* 
>G680 (338..2275)

CAGTTATCTTCTTCCTTCTTCTCTCTGTTTTTTAAATTTATTTTTAGAGAATTTTTTTTG 
TTTTGCTTCCGATTTGATTATTTCCGGGAACGATGACTTCTCCGGGGAGTTCCCGGTGAG 
ATGATAAGTCAGATTGCATACTTGTCTCCTCCATGGCTACTCTCAAGGGTTTTGGCTGCG 
GTGGATTCGTTTGGTTTCTCTAGAATCTAAAGAGGTTATCACAACGGCTTTGCAATTTGA 
AAACTTTCATGTTTGGGGAGATCAAAGATGGTTTCTTTTTTATACTTTACTTGTTAGAGA 
GGATTTGAAGCAGCGAATAGCTGCAACCGGTCCTGTTATGGATACTAATACATCTGGAGA 
AGAATTATTAGCTAAGGCAAGAAAGCCATATACAATAACAAAGCAGCGAGAGCGATGGAC 
TGAGGATGAGGATGAGAGGTTTCTAGAAGCCTTGAGGCTTTATGGAAGAGCTTGGCAACG 
AATTGAAGAACATATTGGGACAAAGACTGCTGTTCAGATCAGAAGTCATGCACAAAAGTT 
CTTCACAAAGTTGGAGAAAGAGGCTGAAGTTA7VAGGCATCCCTGTTTGCCAAGCTTTGGA
CATAGAAATTCCGCCTCCTCGTCCTAAACGAAAACCCAATACTCCTTATCCTCGAAAACC 
TGGGAACAACGGTACATCTTCCTCTCAAGTATCATCAGCAAAAGATGCAAAACTTGTTTC 
ATCGGCCTCTTCTTCACAGTTGAATCAGGCGTTCTTGGATTTGGAAAAAATGCCGTTCTC 
TGAGAAAACATCAACTX^AAAAGAAAATCAAGATGAGAATTGCTCGGGTGTTTCTACTGT 
GAACAAGTATCCCTTACCAACGAAACAGGTAAGTGGCGACATTGAAACAAGTAAGACCTC 
AACTGTGGACAACGCGGTTCAAGATGTTCCCAAGAAGAACAAAGACAAAGATGGTAACGA 
TGGTACTACTGTGCACAGCATGCAAAACTACCCTTGGCATTTCCZACGCAGATATTGTGAA 
CGGGAATATAGC?^AAATGCCCTCAAAATCATCCCTCAGGTATGGTATCTCAAGACTTCAT 
GTTTCATCCTATGAGAGAAGAAACTCACGGGCACGCAAATCTTCAAGCTACAACAGCATC 
TGCTACTACTACAGCTTCTCATCAAGCGTTTCCAGCTTGTCATTCACAGGATGATTACCG 
TTCGTTTCTCCAGATATC^TCTACT^ 

TCCTGCAGCTCATGCTGCAGCTACATTCGCTGCTTCGGTCTGGCCTTATGCGAGTGTCGG 
GAATTCTGGTGATTCATCAACCCCAATGAGCTCTT 

CGCTGCTACAGTAGCTGCTGCAACTGCTTGGTGGGCTTCTCATGGACTTCTTCCTGTATG 
CGCTCCAGCTCCAATAACATGTGTTCCATTC^ 

GACTGAAATGGATACCGTTGAAAATACTCAACCGTTTGAGAAACAAAACACAGCTCTGCA 
AGATCAAACCTTGGCTTCGAAATCTCCAGCTTCATCATCTGATGATTCAGATGAGACTGG 
AGTAACCAAGCTAAATGCCGACTCAAAAACCAATGATGATAAAATTGAGGAGGTTGTTGT 
TACTGCCGCTGTGCATGACTCAAACACTGCCCAGAAGAAAAATCTTGTGGACCGCTCATC 
GTGTGGCTCAAATACACCTTCAGGGAGTGACGCAGAAACTGATGCATTAGATAAAATGGA 
GAAAGATAAAGAGGATGTGAAGGAGACAGATGAGAATCAGCCAGATGTTATTGAGTTAAA 
TAACCGTAAGATTAAAATGAGAGACAACAACAGCAACAACAATGCAACTACTGATTCGTG 
GAAGGAAGTCTCCGAAGAGGGTCGTATAGCGTTTCAGGCTCTCTTTGCAAGAGAAAGATT 
GCCTCAAAGCTTTTCGCCTCCTCAAGTGGCAGAGAATGTGAATAGAAAACAAAGTGACAC 
GTCAATGCCATTGGCTCCTAATTTCAAAAGCCAGGA 

AGTAGTAATGATCGGTGTTGGAACATGCAAGAGTCTTAAAACGAGACAGACAGGATTTAA 

GCC^TACAAGAGATGTTCAATGGAAGTGAAAGAGAGCCAAGTTGGGA^ 

AAGTGATGAAAAAGTCTGCAAAAGGCTTCGATTGGAAGGAGAAGCTTCTACATGACAGAC 

TTGGAGGTAAAAAAAAAA(^TCCACATTTTTATGAATATCTTTAAATCTAGTGTTAGTAG 

TTTGCTTCTCCAATCTTTATGAAAGAGACTC 

CATGTCAGGTTCTGTACCATATTACCCCATGTCTTGTCTCTTGTCTCTGTTTGTGTATGC 
TACTTGTGGTCTATATGTCATCTGCTACTACTGTTAATTAACCATTAAGCAATGGATTTG 
TCTTTA 

>G680 Amino Acid Sequence (domain in AA coordinates: 24-70) 
MDTICTSGEELIiAKAR 

IRSHAQKFFTKLEKEAEVKGIPVCQAIjDIEIPPPRPKRKPNTPYPRKPGNNGTSSSQVSS 
AKDAKLVS SAS S S QLNQAFLDLEKMPFSE KTSTGKENQDENCSGVSTVNKyPLPTKQVS G 
DIETSKTSTVDNAVQDVPKKNKDira^ 

GMVS QDFMFHPMREETHGHANLQATTAS ATTTASHQAFPACHSQDDYRSFLQ I S S TFSNL 
IMS TLIjQNPAAHAAATFAAS VWPYAS VGNSGDS STPMSS S PPS ITAI AAATVAAATAWWA 
SHGLLPVGAPAP ITCVPFSTVAVPTPAMTEI^TVENTO KS PAS S 

SDDSDETGVTKLJSrADSKTNDDKIEEVVWAAVHDSNTA 
TDAIiDKMEKI3KEDVKETDENQPDVI^ 

ALFARERIjPQSFSPPQVAENVNRKQSDTSMPLAPNFKSQDSCAADQEGVVMIGVGTCKSL 

KTRQTGFKPYKRCSMEVKESQVGNINNQSDEKVCKRLRLEGEAST* 

>G867 (64..1098)

CACAACACAAAGACATTTCTGTTTTCTC 

TAAATGGAATCGAGTAGCGTTGATGAGAGTACTACAAGTACAGGTTCCATCTGTGAAACC 
CCGGCGATAACTCCGGCGAAAAAGTCGTCGGTAGGTAACTTATACAGGATGGGAAGCGGA 
TCAAGCGTTGTGTTAGATTCAGAGAACGGCGTAGAAGCTGAATCTAGGAAGCTTCCGTCG 
TCAAAATACAAAGGTGTGGTGCCACAACC3UUICGGAA 

AAACACCAGCGCGTGTGGCTCGGGACATTCAACGAAGAAGACGAAGCCGCTCGTGCCTAC 
GACGTCGCGGTTCACAGGTTCCGTCGCCGTGACGCCGTCACAAATTTCAAAGACGTGAAG 
ATGGACGAAGACGAGGTCGATTTCTTGAATTCTCATTCGAAATCTGAGATCGTTGATATG 
TTGAGGAAACATACTTATAACGAAGAGTTAGAGCAGAGTAAACGGCGTCGTAATGGTAAC 
GGAAACATGACTAGGACGTTGTTAACGTCGGGGTTGAGTAATGATGGTGTTTCTACGACG 
GGGTTTAGATCGGCGGAGGCACTGTTTGAGAAAGCGGTAACGCCAAGCGACGTTGGGAAG

CTAAACCGTTTGGTTATACCGAAACATCACGCAGAGAAACATTTTCCGTTACCGTCAAGT 

AACGTTTCCGTGAAAGGAGTGTTGTTGAACTTTGAGGACGTTAACGGGAAAGTGTGGAGG 

TTCCGTTACTCGTATTGGAACAGTAGTCAGAGTTATGTTTTGACTAAAGGTTGGAGCAGG 

TTCGTTAAGGAGAAGAATCTACGTGCTGGTGACGTGGTTAGTTTCAGTAGATCTAACGGT 

CAGGATCAACAGTTGTACATTGGGTGGAAGTCGAGATCCGGGTCAGATTTAGATGCGGGT 

CGGGTTTTGAGATTGTTCGGAGTTAACATTTCACCGGAGAGTTCAAGAAACGACGTCGTA 

GGAAACAAAAGAGTGAACGATACTGAGATGTTATCGTTGGTGTGTAGCAAGAAGCAACGC 

ATCTTTCACGCCTCGTAACAACTCTTCTTCTTTTTTTTTCTTTTGTTGTTTTAATAATTT 

TTAAAAACTCCATTTTCGTTTTCTTTATTTGCATCGGTTTCTTTCTTCTT 

GGTTCATGAGTTGTTTTTGTTGTATTGATGAACTGTAAATTTTATTTATAGGATAAATTT 

TAAAAAAAAAAAAAAAAAAAA 

>G867 Amino Acid Sequence (domain in AA coordinates: 59-124) 

MESSSVDESTTSTGSICETPAITPAKKSSVGNLYI^GSGSSVVLDSENGVEAESRKLPSS 

KTKGWPQPNGRWGAQIYEKHQRWLGTFNEED^ 

DEDEVDFLNSHSKSErTOMLRKHTYNEELEQSKR^ 

FRSAEALFEKAVTPSDVGKLXreiiVIPKHHM 

RYSYWNSSQSYVLTKGWSRFVKEKNLRAGDVVSFSRSNGQDQQLYIGWKSRSGSDLDAGR
VLRLFGVNISPESSRNDVVGNKRVNDTEMLSLVCSKKQRIFHAS*
>G956 (1..840)

ATGGAGGAGACAGAAAAGAATAAGGGCAGCATAAGTATGGTTGAGGCTAATCTACCTCCT 

GGTTTTAGATTCCATCCTAGAGACGACGAGCTCGTCTGTGACTACTTAATGAGAAGAACC 

GTTCGCAGCCTCTATCAACCAGTTGTCTTGATCGACGTCGATCTTAACAAATGCGAGCCT 

TGGGACATTCCTCAAACGGCGAGAGTGGGAGGGAAAGAATGGTACTTTTACAGCCAAAAA 

GACCGTAAATACGCAACAGGCTACAGAACAAACCGGGCTACGGCCACCGGTTATTGGAAA 

GCCACCGGGAAAGATAGAGCAATCCAAAGAAACGGTGGTCTTGTGGGTATGAGAAAGACA 

CTTGTGTTTTACCGAGGTCGATCCCCTAAAGGTCGTAAAACTGATTGGGTCATGCATGAG 

TTTCGTCTCCAAGGAAAACTTCTTCACCACTCCCCTAATTCTCTCGAGGAAGAGTGGGTA 

TTGTGTAGAGTTTTCCACAAGAACAGCAACGGAGCTGATATAGACGACATCACAAGGAGC 

TGCTCTGATGCAACAGCTTCTGCATTG 

ATCATCAATCAGCATGTACCCTGCTTCTCCAATAATTTGT 

TCCGGTTTAATCTCCAAGAACTCC^GCCCATTGTTTAATGCTTCCCCTGATCAAATGATT 

CTCAGAACTTTGCTAAGTCAACTCACAAAAATy^GTCGAAGAATCACAGAGTCGTGGAGAC 

GGAAGCTCAGAGAGCCAATTGACCGACATTGGCATCCCAAGCCATGCATGGAATTACTGA 

>G956 Amino Acid Sequence (domain in AA coordinates: TBD)

MEETEKNKGSISMVEANLPPGFRFHPRDDELVCDYLMRRTVRSLYQPVVLIDVDLNKCEP

WDIPQTARVGGKEWYFYSQKDRKYATGYRTNRATATGYWKATGKDRAIQRNGGLVGMRKT

LVFYRGRSPKGRKTDWVMHEFRLQGKLLHHSPNSLEEEWVLCRVFHKNSNGADIDDITRS

CSDATASAFMDSYINFDHHHIINQHVPCFSNNLSHNQTNQSGLISKNSSPLFNASPDQMI

LRTLLSQLTKKVEESQSRGDGSSESQLTDIGIPSHAWNY*

>G996 (53..1063)

CGATCGATCTTGAATTGATTCTTTGTAGTATTTTATTTACATATATATATAGATGGGAAG 
ACATTCATGTTGTTACAAACAGAAACTGAGGAAAGGACTTTGGTCTCCTGAAGAAGATGA 
GAAGCTTCTTCGTTACATCACTAAGTATGGTCATGGTTGCTGGAGCTCTGTGCCTAAACA 
AGCTGGTTTACAGAGATGTGGAAAAAGTTGTAGATTAAGATGGATAAATTATTTJVAGACC 
AGATTTGAAGAGAGGAGCATTTTCTCAAGATGAAGAAAATCTCATTATTGAACTTCATGC 
CGTTCTTGGCAATAGATGGTCTCAGATAGCTGCACAGCTTCCTGGAAGAACCGACAATGA 
AATCAAGAATCTTTGGAATTCTTGTTTGAAGAAGAAATTGAGGCTGAGAGGAATTGACCC 
GGTTACACACAAGCTCTTAACCGAAATCGAAACCGGTACAGATGACAAAACAAAACCGGT 
TGAGAAGAGTCAACAGACCTACCTCGTTGAGACTGATGGCTCCTCTAGTACCACTACTTG 
TAGTACTAACCAAAACAACAACACTGATCATCTTTATACCGGAAATTTCGGTTTTCAACG 
GTTAAGTCTAGAAAACGGTTCAAGAATCGCAGCCGGTTCTGACCTCGGTATCTGGATTCC 
CCAAACCGGAAGAAACCATCATCATCATGTCGATGAAACCATCCCTAGTGCAGTGGTACT 
ACCCGGTTCAATGTTCTC^TCCGGTTTAACCGGTTATAGATCCTCCAATCTCGGTTTAAT 
TGAATTGGAAAACTCATTCTCAACCGGGCCAATGATGACAGAGCATCAGCAAATTCAAGA 
GAGTAACTACAAG?U^TTGAACATTCTTTGGAAATGGGAATCTGAATTGGGGATTAACIAAT 
GGAGGAAAATCAAAATCCATTCACAATATCGAATCATTGAAATTCGTCCTTATACAGTGA 
TATAAAATCAGAGACCAATTTTTTTGGCACAGAGGCTACAAATGTTGGTATGTGGCCATG
TAACCAGCTTCAGCCTCAGCAACATGCATATGGCCATATATAAATCTTCTTGTATATTAT 

AA

>G996 Amino Acid Sequence (domain in AA coordinates: 14-114) 

MGRHSCCYKQKLRKGLWSPEEDEKLLRYITKYGHGCWSSVPKQAGLQRCGKSCRLRWINY 

LRPDLKRGAFSQDEENLIIELHAVLGNRWSQIAAQL^

IDPVTHKLLTEIETGTDDKTKPVEKSQQTYLVETDGSSSTTTCSTNQNNNTDHLYTGNFG

FQRLSLENGSRIAAGSDLGIWIPQTGRNHHHHVDETIPSAVVLPGSMFSSGLTGYRSSNL

GLIELENSFSTGPMMTEHQQIQESNYI^STFFGNGIIIjNWGLTMEEIJQNPFTISNHSNSSIj 

YSDIKSETNFFGTEATNVGMWPCNQLQPQQHAYGHI*

>G1946 (90..1547)

TCTCACCTATTGTAAAAATCACCAGTTTCGTATATAAAACCCTAATTTTCTCAAAATTCC 
CAAATATTGACTTGGAATCAAAAATCCGAATGGATGTGAGCAAAGTAACCACAAGCGACG 
GCGGAGGAGATTCAATGGAGACTAAGCCATCTCCTCAACCTCAGCCTGCGGCGATTCTAA 
GTTCAAACGCGCCTCCTCCGTTTCTGAGCAAGACCTATGATATGGTTGATGATCACAATA 
CAGATTCGATTGTCTCTTGGAGTGCTAATAACAACAGTTTTATCGTTTGGAAACCACCGG 
AGTTCGCTCGCGATCTTCTTCCTAAGAACTTTAAGCATAATAATTTCTCCAGCTTCGTTA 
GACAGCTTAATACCTATGGTTTCAGGAAGGTTGACCCAGATAGATGGGAATTTGCGAATG 
AAGGTTTTTTAAGAGGTCAGAAGCACTTGCTACT^ATCAATAACTAGGCGAAAACCTGCCC 
ATGGACAGGGACAGGGACATCAGCGATCTCAGCACTCGAATGGACAGAACTCATCTGTTA 
GCGCATGTGTTGAAGTTGGCAAATTTGGTCTCG2\AGAAGAAGTTGAAAGGCTTAAAAGAG 
ATAAGAACGTCCTTATGCAAGAACTCGTCAGATTAAGACAGCAGCAACAGTCCACTGATA 
ACCAACTTCAAACGATGGTTCAGCGTCTCCAGGGCATGGAGAATCGGCAACAACAATTAA 
TGTCATTCCTTGCAAAGGCAGTACAAAGCCCTGA^ 

AGAATCAGCAAAACGAGAGTAATAGGCGCATCAGTGATACCAGTAAGAAGCGGAGATTCA 
AGCGAGACGGCATTGTCCGTAATAATGATTCTGCTACTCCTGATGGACAGATAGTGAAGT 
ATCAAC CTC CAATGCACGAGCAAGC CAAAGCAATGTTT AAACAGCTTATGAAG ATGGAAC 
CTTACAAAACCGGCGATGATGGTTTCCTTCTAGGTAATGGTACGTCTACTACCGAGGGAA 
CAGAGATGGAGACTTC^TCAAACCAAGTATCGGGTATAACTCTTAAGGAAATGCCTACAG 
CTTCTGAGATACAGTCATCATCACCAATTGAAACAACTCCTGAAAATGTTTCGGCAGCAT 
(^GAAGCAACCGAGAACTGTATTCCTT<^CCTG^ 

ATATGCTACCGGAAAATAATTCAGAGAAGCCTCCAGAGAGTTTCATGGAACCAAACCTGG 
GAGGTTCTAGTCCATTACTAGATCCAGATCTGTTGATCGATGATTCTTTGTCCTTCGACA 
TTGACGACTTTCCAATGGATTCTGATATAGACCCTGTTGATTACGGTTTACTCGAACGCT 
TACTCATGTCAAGCCCGGTTCCAGATAATATGGATTCAACACCAGTGGACAATGAAACAG 
AGCAGGAACAAAATGGATGGGACAAAACTAAGCATATGGATAATCTGACTCAACAGATGG 
GTCTCCTCTCTCCTGAAACCTTAGATCTCTCAAGGCAAAATCCTTGATTTTGGGAGTTTT 
TAAAGTCTTTTGAGGTAACACAGTCCCTGAGAGCAGCATATTCAT 

>G1946 Amino Acid Sequence (domain in AA coordinates: 32-130) 

MDVSKVTTSDGGGDSMETKPSPQPQPAAILSSNAPPPFLSKTYDMVDDHNTDSIVSWSAN

NNSFIVWKPPEFARDLLPKNFKEQ^FSSFVRQIjNTYGFRKVDPDRW 

LQSITRRKPAHGQGQGHQRSQHSNGQNSSVSACVEVGKFGLEEEVERLKRDKNVLMQELV
RLRQQQQSTDNQLQTMVQRLQGMENRQQQLMSFLAKAVQSPHFLSQFLQQQNQQNESNRR
ISDTSKKRRFKRDGIVRlflroSATPDGQIVKYQPP 

LGNGTSTTEGTEMETSSNQVSGITLKEMPTASEIQSSSPIETTPENVSAASEATENCIPS
PDDLTLPDFTHMLPENNSEKPPESFMEPl^GGSSPI^ 

DPVDYGLLERLLMSSPVPDNMDSTPVDNETEQEQNGWDKTKHMDNLTQQMGLLSPETLDL
SRQNP*
>G217 (84..2618)

cttcgttcttaccgadfttccacgagcattagcttcagagaccttgaattggagtgcggtt 
ggatcaaaaacagttgagcgaagatgaggattatgattaagggaggtgtttggaagaaca 
ccgaagatgagattctcaaagccgccgtgatgaagtatggtaagaaccaatgggctcgga 
tctcgtctcttctcgttcgtaagtctgctaaacagtgtaaagctcgctggtacgagtggc 
tcgatccatctatcaaaaagactgaatggaccagagaagaagatgagaagcttctacatc 
ttgctaaacttctgcctactcaatggagaactattgctcctattgtgggtcgtacaccat 
ctcaatgtcttgagaggtatgagaagctccttgatgcagcatgcactaaggatgaaaatt 
atgatgcagcggatgatccacgaaaattacgtcctggtgagattgatccgaacccagaag
caaagcctgctcgtcctgatccggtagacatggacgaagatgagaaagaaatgctttctg 

aagcaagagctagattggctaacacgaggggaaagaaggctaaaagaaaagctagagaaa 

aacaacttgaggaagctagaaggcttgcttctctgcaaaaaagaagagaactaaaagcag 

ctgggattgatggaaggcataggaaaagaaagagaaagggaatcgactataatgcagaaa 

ttccttttgaaaagagggcacctgcgggattttatgatactgcggatgaagatcgtcctg 

ctgatcaagtaaaatttccaactaccattgaagaacttgaaggaaaaagaagagctgatg 

tagaagcacatttacgcaaacaagatgttgcaaggaataaaattgctcagagacaggatg 

ctccagcagctatattgcaagcaaacaagctgaatgatccggaagttgttaggaagaggt 

caaagctgatgttaccaccaccgcagatttcagaccacgagctagaagaaattgctaaga 

tgggctatgccagtgaccttcttgccgagaatgaggagctaacagaaggcagtgctgcta 

ctcgtgcacttttggcaaattactcacaaacaccaaggcaaggaatgacacccatgagga 

cacctcaaagaactcctgctggtaaaggtgatgctattatgatggaagcagaaaacctgg 

ccagattaagagactctcagacacctttgctaggaggagaaaatcctgagttgcaccctt 

ctgacttcactggggtcactccgagaaagaaggagattcaaacgcctaatccaatgttga 

ccccttcaatgactcctggtggtgctggtcttactccaagaattggcttgacgccatcaa 

gggatgggtcttctttttctatgacacccaaagggactcccttcagggatgaacttcaca 

ttaacgaagacatggacatgcagcaaagtgcaaaacttgagaggcagagacgagaggaag 

ctagaaggagtttacgctctggtttgactgggcttcctcagccaaagaacgagtaccaaa 

tagttgcacaacctcctcctgaggaaagtgaagagccagaagagaaaattgaggaagaca 

tgtcagacaggatagcgagggaaaaggcggaggaagaagcaagacaacaggcattgctta 

agaagagatccaaggtcttgcagagagatcttcctagacccccagctgcttcattggcag 

taattaggaactcgttgctttcagctgatggagacaaaagttctgttgttcctcctactc 

cgattgaggttgcagataaaatggtaagagaggagcttctacagttgctggagcatgata 

acgcaaagtatccgcttgatgacaaagctgagaagaagaaaggagccaagaaccgtacca 

accgttctgcttctcaagttcttgcaattgacgattttgatgaaaatgagctccaagagg 

ctgacaaaatgataaaggaggaggggaagtttctgtgtgtgtcaatgggacatgagaaca

agacacttgatgattttgtagaagctcacaacacatgcgtgaatgatctcatgtatttcc 

ccactcgaagcgcttacgagctctcaagtgttgctgggaacgcggacaaagttgcagctt 

ttcaggaggagatggagaatgtgagaaaaaagatggaggaggatgagaagaaggcagaac 

acatgaaggccaagtacaaaacttatacaaagggtcatgagaggagggcagagaccgtgt

ggacccaaatagaggcgacattgaagcaggctgagattggtggaacagaagtagagtgct 

ttaaagcattgaagaggcaagaagagatggctgcatcttttaggaaaaagaatttgcaag 

aggaagtgataaagcaaaaggaaacagagagtaaactgcagactcgctatgggaatatgt 

tggcaatggttgaaaaagcagaggagataatggtcggtttccgagcacaggcattgaaga 

aacaagaggatgttgaagattctcacaaactgaaagaagctaagctagccactggagagg 

aagaggacatagccatagccatggaagcttctgcataaaaacttgagttttgtattgctt 

acaagttttaaggagacgtagcttgactttgtattggtaagtttttttaatatgagtcat 

gactttgtaaaaaggttatgatatattctctgtttgtatgctttgcaagagtcaagaaat 

ttgaatgcttcaggatcaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa 

>G217 Amino Acid Sequence (conserved domain in AA coordinates 

MRIMIKGGVWKNTEDEILKAAVMKYGKNQWARISSLLVRKSAKQCKARWYEWLDPSIKKT

EWTREEDEKLIiHIiAKIjLPTQWRTI^ 

KLRPGEIDPNPEAKPARPDPVDMDEDEKEMIiSEARARIiANTRGK^ 

loASLQKRRELKAAGIDGRHRKRKRKGIDYNAEIPFEKRAPAGFYDTADEDRPAD 

T IEELEGKRRADVEAHLRKQDVARNKIAQRQDAPAAILQAinajl^PEVVRKRS KLMLPPP 

QISDHELEEIAKMGYASDLLAENEELTEGSAATRALLANYSQTPRQGMTPMRTPQRTPAG

KGDAIMMEAENLARLRDSQTPLLGGENPELHPSDFTGVTPRKKEIQTPNPMLTPSMTPGG

AGLTPRIGLTPSRDGSSFSMTPKGTPFRDELHINEDMDMQQSAKLERQRREEARRSLRSG

LTGLPQPKlSJEYQIVAQPPPEESEEPEEKIEEDMSDRIAREKAEEEARQQALIiKKRSKV^ 

RDLPRPPAASLAVIRNSIiLSADGDKSSWPPTPIEVADKMVREELLQLLEH^ 

KAEKKKGT^KNRTNRSASQVLAIDDFDENELQEADKMIKEEGKFLCV 

AHNTCA/NDLMYFPTRSAYELSSVAGNADKVAAFQEEI^ 

YTKGHERRAETVWTQIEATLKQAEIGGTEVECFKALKRQEEMAASFRKKNLQEEVIKQKE

TESKLQTRYGNMLAMVEKAEEI^^ 

EASA* 

>G2192 (92..2971)

CGGAAAGAGATCAACCAACGATAGAGGAGAAGAAGAACTTGCATACGCAAAAAAACTTTC 
CCGGGAAAATTCCAGAAACTGCTTTGGAAAAATGTGCGAGCCCGATGATAATTCCGCTAG
AAACGGCGTCACTACTCAACCTTCGAGGTCAAGGGAGCTTCTAATGGATGTTGACGACTT 
AGATCTTGACGGTTCATGGCCACTAGATCAAATCCCTTACTTATCCTCATCGAATCGCAT 
GATTTCTCCGATTTTTGTCTCCTCTTCCTCTGAGCAGCCTTGCTCGCCTCTCTGGGCTTT 
CTCCGACGGTGGAGGAAATGGTTTTCACCACGCAACCTCCGGTGGCGATGATGAGAAGAT 
CAGCTCTGTCTCCGGTGTTCCTTCTTTCCGTCTCGCCGAGTATCCTCTCTTCCTCCCTTA
CTCTTCTCCATCAGCAGCTGAGAACACAACAGAGAAGCATAACAGTTTCCAGTTTCCGTC 
TCCATTGATGAGCCTAGTCCCACCAGAGAACACAGACAACTACTGTGTGATCAAAGAGAG 
GATGACTCAGGCGCTTCGATACTTCAAAGAATGAACCGAAC^^ 

CTGGGCTCCTGTGAGAAAGAATGGTCGTGATTTGCTGACGACTTTGGGTCAACCTTTTGT 
TCTTAATCCTAATGGTAATGGGCTTAATCAATACAGGATGATCTCTCTCACATATATGTT
TTCTGTGGATAGTGAAAGTGACGTAGAGCTCGGACTCCCGGGTCGAGTTTTCCGTCAGAA 
ATTGCCTGAATGGACTCGAAATGTTCAGTACTATTCC^ 

TCACGCCTTGCACTACAACGTGCGTGGTACACTGGCCTTGCCTGTCTTTAATCCCTCTGG 

TCAGTCCTGCATAGGTGTTGTGGAACTTATAATGACCTCAGAGAAGATTCACTATGCACC 

CGAAGTGGACAAAGTTTGCAAAGCCCTTGAGGCGGTAAATCTGAAAAGCTCGGAAATACT 

TGATCACCAAACAACACAGATATGCAATGAGAGTCGCCAAAACGCGCTTGCTGAGATTCT 

CGAAGTGCTGACAGTTGTATGTGAGACCCATAACTTGCCTCTCGCTCAGACTTGGGTTCC 

ATGTCAGCATGGGAGCGTTCTTGCCAATGGTGGCGGTCTAAAGAAAAACTGCACGAGCTT 

TGACGGTAGCTGCATGGGTCAAATCTGCATGTCTACAACCGACATGGCCTGCTATGTCGT 

GGATGCTCATGTCTGGGGCTTTAGAGATGCCTGTCTTGAACACCATCTCCAGAAAGGCCA 

GGGAGTCGCTGGACGAGCTTTTCTCAATGGTGGCTCATGTTTCTGCAGAGACATCACCAA 

GTTCTGCAAAACGCAGTACCCACTAGTCCATTATGCGCTGATGTTCAAGTTGACCACTTG 

TTTTGCAATATCTCTCCAGAGCTCTTACACGGGCGACGACAGTTACATTCTTGAATTTTT 

TCTTCCTTCGAGTATAACAGACGACCAAGAGCAAGATTTGCTGTTGGGTTCTATTTTGGT 

GACAATGAAAGAACATTTTCAGAGTCTGAGGGTTGCATCTGGGGTTGACTTTGGTGAAGA 

TGACGACAAATTGTCTTTCGAGATCATCCAAGC3ATTACCGGACAAGAAGGTTCA 

AATAGAATCCATTCGAGTTCCCTTTTCTGGTTTTAAGTCAAATGCAACAGAGACGATGTT 

GATTCCTCAGCCTGTGGTTCAGTCTTCTGATCCAGTAAATGAGAAAATCAACGTGGCCAC 

TGTTAACGGTGTGGTTAAGGAGAAGAAGAAAACAGAGAAAAAGCGTGGGAAGACTGAGAA 

AACAATGAGTCTAGATGTACTTCAGCAGTATTTC^ 

GAGCCTAGGAGTTTGCCCGACGACAATGAAGCGAATTTGCAGGCAACACGGAATCTCGCG 
GTGGCCATCGAGGAAGATCAAGAAAGTGAATCGTTCAATCACA7VAGCTGAAACGAGTCAT 
CGAATCTGTTCAAGGTACTGATGGAGGCCTCGACCTGACTTCCATGGCCGTTAGTTCCAT 
CCCTTGGACACACGGTCAAACATC^GCACAGCCACTAAACTC^CCC^TGGTTCCAAACC 
ACCTGAGCTACCAAACACCAATAATTCACCTAACCATTGGTCAAGTGATCACAGTCCGAA 
CGAGCCAAATGGTTCGCCTGAGTTACCACCAAGCAAT^ 

GGATGAGAGCGCTGGGACTCCAACCTCTCATGGCTCATGTGACGGTAACCAATTAGATGA 
ACCGAAAGTCCCAAATCAAGATCCGCTCTTCACGGTTGGTGGATCACCCGGGCTCCTTTT 
TCCACCTTATTCTAGAGATCATGATGTATCTGCAGCTTCCTTCGCAATGCCGAACAGGCT 
TCTTGGTTCTATAGACCATTTCCGAGGAATGCTCATTGAAGACGCTGGAAGTTCAAAAGA 
TCTGAGAAATCTCTGCCCCACTGCAGCATTTGACGATAAGTTTCAAGACACAAACTGGAT 
GAACAATGATAATAATAGCAACAACAACTTATACGCTCCCCCAAAGGAAGAGGCCATTGC 
AAATGTTGCATGCGAACCATCAGGCTCAGAAATGAGAACGGTAACAATCAAAGCAAGTTA 
CAAAGACGACATAATACGGTTCAGAATATCCTCGGGTTCAGGTATAATGGAATTGAAGGA 
TGAAGTGGCTAAGAGGCTGAAAGTTGATGCAGGAACGTTCGATATCAAGTATCTTGACGA 
TGATAACGAATGGGTTTTAATAGCTTGTGATGCTGATCTTCAAGAATGTCTCGAGATCCC 
TAGATCCTCCCGCACGAAAATCGTAAGGCTCTTAGTTCATGATGTAACGACAAATCTAGG
GAGCTCCTGCGAGAGCACTGGAGAATTGTGACCTGATAATTCATTCGAACTCTTTTGTAA 
ATAG 

>G2192 Amino Acid Sequence (conserved domain in AA coordinates: 600-700)
MCEPDDNSARNGVTTQPSRSRELLMDVT>D^ 

EQPCSPLWAFSDGGGNGFHHATSGGDDEKISSVSGVPSFRLAEYPLFLPYSSPSAAENTT
EKHNSFQFPSPLMSLVPPENTDNYCVIKERMTQALRYFKESTEQHVLAQVWAPVRKNGRD
LLTTLGQPFVXjNPNGNGLNQYRMISLTYMFSvXJSESDVFILGLPGRWRQKIiPEWTPNVQY 
YSSKEFSRIjDHALHYNVRGTLAIjPVFNPSGQSCIGVVELIMTS 
AVNLKSSEILDHQTTQICNESRQNALAEILEVLTW 
GGLKKNCTSFDGSCMGQICMSTTDMACYVVDAHVWGFRD^

GSCFCRDITKI'CKTQYPLVHYAIjMFKIiTTCFAISLQSSYTGDDSYILEFFLPSSITDDQB 
QDLLLGSILVTMKEHFQSLRVASGVDFGEDDDKLSFEIIQALPDKKVHSKIESIRVPFSG 
FKSNATETMIiIPQPWQSSDPWEKINVATWGVVKEKKKTEKKRGICrE 
FTGSI.KDAAKSLGVCPTTMKRICRQHGISRWPSRKIKKWRSITKLKRVIESVQGTDGGL 
DLTSMAVSSIPWTHGQTSAQPLNSPNGSKPPELPNTNNSPNHWSSDHSPNEPNGSPELPP
SNGHKRSRTVDESAGTPTSHGSCDGNQLDEPKVPNQDPLFTVGGSPGLLFPPYSRDHDVS
AASFAMPNRLLGSIDHFRGMLIEDAGSSKDLRNLCPTAAFDDKFQDTNWMNNDNNSNNNL
YAPPKEEAIAITVACEPSGSEMRTVTIKASYKDDI^^ 

GTFDIKYLDDDNEWVLIACDADLQECLEIPRSSRTKIVRLLVHDVTTNLGSSCESTGEL*
>G504 (69..1040)

CGTCGACCTCTTGACGATCATGAGACTGATTTCGTGAAAATATCGTCATTATATCAAATT 
AGAAGTTGATGGAAAACATGGGGGATTCGAGCATAGGGCCGGGCCATCCGCATCTCCCTC 
CCGGGTTTCGGTTTCACCCGACTGATGAGGAACTAGTAGTTCATTACCTCAAGAAGAAAG 
CAGATTCTGTTCCACTTCCAGTCTCAATCATCGCAGAGATTGATCTTTACAAGTTTGATC 
CTTGGGAGCTTCCAAGCAAGGCGAGTTTTGGAGAGCACGAGTGGTACTTCTTTAGTCCTC 
GGGATCGGAAGTATCCAAATGGGGTTAGGCCAAACCGGGCAGCAACTTCCGGTTATTGGA 
AAGCAACGGGAACCGATAAACCGATATTTACGTGCAATAGTCACAAGGTTGGTGTCAAGA 
AAGCGCTTGTTTTTTACGGTGGAAAGCCTCCTAAAGGGATAAAAACAGATTGGATCATGC 
ATGAATATCGCCTCACTGATGGTAACCTTAGCACTGCGGCTAAGCCGCCTGACTTAACCA 
CGACAAGGAAAAACTCACTACGGCTAGACGATTGGGTTCTATGTAGGATCTATAAGAAGA 
ATAGTTGAC2\AAGACCAACAATGGAGAGAGTATTACTTAGAGAGGATCTAATGGAAGGCA 
TGCTCTCAAAATCATCTGCTAATTCTTCTTCTACATCAGTACTAGACAACAACGACAACA 
ATAATAACAATAACGAAGAACACTTTTTCGACGGTATGGTCGTTTCTTCAGACAAACGTT 
CCTTGTGTGGTCAATACCGAATGGGCCACGAGGCCTCAGGATCATCTTCATTCGGATCTT 
TCTTATCGAGCAAGAGGTTTCATCATACAGGTGATCTCAACAATGATAACTACAATGTCT 
CTTTTGTTTCGATGCCTAGTGAGATTCCTC^ 

TGGATACGACGTCGTCTCTAGCTGATCATGGGGTTTTAAGACAGGCGTTTCAGCTTCCTA 

ACATGAACTGGCACTCATAATCTATATAGATATATATGTGTGTATCATATATGTATCTAT 

GCAGGCCTAATATAGTTTACACATAAATCATCTGGGGCGGCCGCT 

>G504 Amino Acid Sequence (domain in AA coordinates: TBD) 

MENMGDSSIGPGHPHLPPGFRFHPTDEELVVHYLKKKADSVPLPVSIIAEIDLYKFDPWE

LPSKASFGEHEWYFFSPRDRKYPNGVRPNRAATSGYWKATGTDKPIFTC

VFYGGKPPKGIKTDWIMHEYRLTDGNLSTAAKPPDLTT^ 

QRPTMERVLLREDLMEGMLSKSSANSSS^^ 

GQYRMGHEASGSSSFGSFLSSKRFHHTGDLNNDNYNVSFVSMLSEIPQSSGFHANGVMDT

TSSLADHGVLRQAFQLPNMNWHS* 

>G622 (248..2620)

TCTTTCTTTCTTC2\ATTCGCCGTCAAAATCTTCTCTTTCTCTTCCCCCGCCGGTCCTTCA 

CCAATCCTCTGATCTCTCTACACACGAACCTTTGATTTTGACCAACGTCGATGCATGTTC 

AJTGACTAGTCTCTTCCTCAATCCTTCAATTTCATCAATTCACGTCGATTT^ 

TCGTTGTTCTAGCTCTTTGTGTGGTGTTAGGGTTTTAAGATTTTGGAATTGGGGTTTGGA 

GTTTGTGATGTTTGAAGTCAAAATGGGGTCAAAGATGTGCATGAACGCTTCATGTGGTAC 

GACTTCTACTGTTGAATGGAAGAAAGGTTGGCCTCTTCGATCTGGTCTTCTCGCTGATCT 

CTGTTATCGTTGCGGATCTGCGTATGAGAGTTCTCTATTCTGTGAACAATTTCATAAGGA 

CCAATCTGGTTGGAGGGAATGCTATTTGTGTAGCAAGAGACTACATTGTGGATGCATTGC 

TTCTAAGGTAACGATTGAGTTAATGGACTATGGTGGTGTTGGTTGTAGTACATGTGCTTG 

CTGCCATCAACTCAATTTGAACACAAGGGGTGAG 

AATGAAAACGTTAGCTGATAGGCAACATGTAAATGGCGAAAGCGGAGGAAGAAACGAAGG 
CGATCTCTTTTCTCAGCCACTAGTCATGGGCGGAGATAAAAGGGAAGAGTTCATGCCTCA 
CCGTGGGTTTGGTAAGCTAATGAGTCCAGAAAGTACAACCACCGGGCATAGGCTGGATGC 
TGCTGGGGAAATGCATGAATCATCACCTTTACAGCCATCTTTAT^ATATGGGTTTGGCTGT 
GAATCCGTTTAGCCCATCTTTTGCAACCGAGGCTGTCGAGGGAATGAAACACATCAGTCC 
TTCTCAGTCCAACATGGTCCATTGCTCTGCTTCTAATATACTGCAAAAGCCATCAAGACC 
TGCTATTTCAACTCCTCCTGTGGCTAGTAAATCCGCTCAGGCGCGGATTGGAAGGCCTCC 
TGTCGAAGGGCGAGGGAGAGGCCACTTGCTTCCGCGGTATTGGCCAAAATATACGGATAA 
AGAGGTTCAGCAGATCTCTGGAAATTTGAATTTGAAC^TTGTACCTCTCTTTGAGA2\AAC 
TCTTAGTGCCAGTGATGCTGGTCGCATTGGTCGTCTAGTTCTTCCAAAAGCCTGTGCAGA
GGCATATTTTCCTCCGATTAGTCAATCCGAAGGCATTCCTTTGAAAATCCAAGATGTGAG 
GGGTAGGGAGTGGACGTTCCAGTTCAGATATTGGCCCAATAACAATAGTAGAATGTATGT 
TTTAGAAGGTGTCACTCCATGCATACAGTCCATGATGCTACAGGCTGGTGATACAGTAAC 
TTTCAGTCGGGTTGATCCTGGCGGAAAACTAATCATGGGTTCCAGGAAGGCAGCTAATGC 
TGGAGACATGCAGGGTTGTGGGCTCACCAACGGAACATCAACTGAGGACACATCATCGTC 
TGGTGTAACAGAAAACCCACCCTCCATAAATGGTTCCTCGTGTATTTCACTAATACCGAA 
AGAGTTGAATGGTATGCCTGAGAATTTGAACAGTGAGACTAACGGGGGCAGGATAGGTGA 
TGATCCTACACGAGTTAAAGAGAAGAAGAGAACTCGAACCATTGGTGCAAAAAATAAGAG 
ACTTCTTTTGCATAGTGAAGAATCTATGGAGCTGAGACTCACTTGGGAAGAAGCTCAGGA 
CTTGCTTCGTCCCTCTCCTAGTGTAAAGCCTACCATCGTTGTCATTGAGGAGCAAGAAAT 
TGAAGAATATGACGAACCTCCTGTCTTTGGAAAGAGGACTATAGTCACTACAAAACCTTC 
AGGTGAACAGGAACGATGGGCAACTTGCGACGACTGCTCTAAATGGAGAAGGTTACCTGT 
AGATGCTCTTCTTTCCTTTAAATGGACATGTATAGACAATGTTTGGGATGTGAGTAGGTG 
TTCATGTTCTGCACCGGAGGAGAGTCTGAAGGAACTTGAGAATGTTCTTAAAGTAGGTAG 
AGAGGAXIIAAGAAGAGAAGAACTGGGGAAAGACAGGCAGCAC^ 

TGGTTTGGACGCACTGGCGAGTGCAGCAGTCTTAGGAGACACAATAGGCGAGCCAGAGGT 

AGCGACCACGACCAGACATCCAAGGCACAGGGCTGGATGCTCTTGCATCGTGTGCATTCA 

GCCACCAAGTGGGAAAGGTAGGCACAAGCCTACATGTGGCTGCACTGTGTGTAGCACCGT 

GAAGAGAAGGTTCAAGACGCTTATGATGAGGAGGAAGAAGAAGCAGTTGGAGCGCGATGT 

AACAGCAGCAGAAGATAAGAAGAAGAAGGACATGGAACTGGCTGAGTCTGATAAGAGTAA 

GGAGGAGAAGGAAGTGAACACAGCGAGAATAGACCTGAACAGTGATCCATACAATAAAGA 

AGATGTTGAAGCTGTTGCGGTGGAGAAAGAAGAGAGTCGAAAAAGAGCAATAGGACAGTG 

TTCGGGCGTGGTGGCTCAAGACGCCAGTGATGTTTTAGGAGTTACAGAGTTAGAAGGAGA 

GGGTAAGAATGTTCGTGAAGAGCCGAGAGTTTCAAGCTGATATGGAAA 

>G622 Amino Acid Sequence (domain in AA coordinates: TBD) 

MFEVKMGSKMCMNASCGTTSTVEWKKGW 

GWRECYIjCSKRLHCGCIASKVTIELMDYGGVGCSTCACCHQIjNIjN^ 

TLADRQHVNGESGGRNEGDLFSQPLVMGGDKREEFMPHRGFGKLMSPESTTTGHRLDAAG
EMHESSPIjQPSIiNMGIiAVNPFSPSFATEAVEGMKHISPSQSNMVHCSASNILQKP 
STPPVASKSAQARIGRPPVEGRGRGHLLPRYWPKYTDKEVQQI^ 
ASDAGRIGRLVIiPKACAEAYFPPISQSEGIPIjKIQDVRGREWTFQ 

GVTPCIQSMMLQAGDTVTFSRVDPGGKLIMGSRKAANAGDMQGCGLTNGTSTEDTSSSGV 

TEOTPSINGSSCISLIPKELNGMPENIxNSETNGGRIGDDPTRVKEKKRTRTIGA™ 

LHSEESMELRLTWEEAQDLLRPSPSVKPTIWIEEQEIEEYDEPPWGKRTIVTTKPSGE 

QERWATCTDCSKWRRLPVDALLSFKWTCIDNVWDVSRCSCSAPEESLKELEN^ 

KKRRTGERQAAQSQQEPCGLDALASAAVLGDTIGEPEVATTTRHPRHRAGCSCIVCIQPP

SGKGRHKPTCGCTVCS TVKI^FKTIiMMRRKKKQLERDVTAAEDKKKKDMELAESDKS KEE 

KEWTARIDLNSDPY*nOEDVEAVAVEKE^ 

NVREEPRVSS* 

>G778 (50..1249)

TCTCAATAACACAAAACCTTTTAAACTAGTAAAATACACAGATTTTAGGATGAGCCAATG 
TGTTCCAAACTGTCACATCGATGATACTCCGGCAGCAGCCACCACCACCGTCCGCTCCAC 
CACAGCCGCAGACATCCCCATATTAGACTACGAGGTAGCCGAGCTGACGTGGGAGAACGG 
GCAACTAGGCTTGCACGGCTTAGGTCCACCGCGAGTGACGGCTTCGTCGACCAAGTACTC 
CACAGGCGCCGGTGGAACGTTGGAGTCGATAGTGGACCAAGCTACTCGCCTCCCTAACCC 
TAAGCCCACGGATGAGCTCGTCCCGTGGTTCCATCATCGCTCCTCCAGGGCCGCGATGGC 
AATGGACGCGCTTG^CCCTTGCTCC^^ 

CGTTGGCTCCACCCGGGTGGGGTCATGTAGCGATGGTCGTACCATGGGCGGTGGAAAACG 
AGCAAGAGTGGCACCGGAGTGGAGCGGCGGCGGGAGTCAGCGGCTGACCATGGACACTTA 
CGACGTAGGTTTCACCTCAACATCAATGGGCTCGCACGATAACACAATCGACGATCATGA 
CTCCGTCTGCCACAGCCGCCCACAGATGGAGGACGAAGAAGAGAAGAAAGCCGGAGGAAA 
ATCATC^GTTTC^CCAAGAGAAGCAGAGCTGCTGCTATTCATAACCAATCCGAACGT^ 
GAGGAGAGATAAAATCAATCAAAGGATGAAGACTTTGGAAAAACTGGTTCCCAATTCCAG 
CAAGACGGATAAAGCATCTATGTTGGATGAAGTGATAGAGTATTTGAAGCAACTTCAAGC 
ACAAGTGAGCATGATGAGCAGAATGAATATGCCTTCTATGATGCTTCCTATGGCCATGCA 
GCAACAACAACAACTACAAATGTCTCTCATGTCCAATCCCATGGGTTTAGGGATGGGCAT 
GGGGATGCCCGGTCTCGGTCTCCTCGACCTTAATTCTATGAACCGAGCTGCTGCAAGCGC
TCCTAATATCCATGCCAACATGATGCCAAACCCATTTTTGCCCATGAATTGTCCATCGTG 
GGATGCTTCTTCCAATGACTCTCGATTTCAGTCTCCTCTCATCCCCGATCCTATGTCTGC 
CTTTCTTGCATGCTCTACTCAGCCAACGACGATGGAAGCGTATAGCAGGATGGCTACATT 
ATATCAGC^\AATGCAACAACAACTTCCTCCTCCTTCGAATCCAAAATGATTATTACTCAA 
ACACCTCTATATAGTTTACGTCTATATATGTGTTAGTCACATACATACATATATATATTC 
CATCATAATTATTTATTTATATGTATAGGCTTCTCATGAATTATGATATTATACGTATTA 
CGTAAAAAA 

>G778 Amino Acid Sequence (domain in AA coordinates: 220-267) 
MSQCVPNCHIDDTPAAATTTVRSTTAADIPILDYEVAELTWENGQLGLHGLGPPRVTASS
TKYSTGAGGTLESIVDQATRLPNPKPTDELVPWFHHRSSRAAMA^ 

kpggvgstrvgscsdgrtmgggkrarvapewsgggsqrltmdtydvgftstsm 

ddhdsvchsrpq^ledeeekkaggkssvstkrsraaaihnqserkrrdkinqrmktxlqklv 

'pnssktdkasmldevieylkqlqaqvsmmsrmnmpsm^ 

gmgmgmpglgliidiinsmnraaasapnihanmmpnpf^ 

pmsaflacstqpttmeaysrmatlyqqmqqqlpppsnpk* 

>G791 (173..877)

TTTTCTTTGGGTGTTCCTTCCACCAACGGCAGAAATCGATTCGGCTTAAATCTCCCCCTC 

CTTTCGATCTCTCTGATCGCCGCCGGGAACATTCAATTTCCCGGGAGTTCAACAAAAAAA 

AAACTCTCCGTTTTTATTTTTCCCCCTTTTTCACCGGTGGAAGTTTCCGGAGATGGTGTC 

ACCCGAAAACGCTAATTGGATTTGTGACTTGATCGATGCTGATTACGGAAGTTTCACAAT 

CCAAGGTCCTGGTTTCTCTTGGCCTGTTGAGCAACCTATTGGTGTTTCTTCTAACTCCAG 

TGCTGGAGTTGATGGCTCGGCTGGAAACTCAGAAGCTAGCAAAGAACCTGGATCCAAAAA 

GAGGGGGAGATGTGAATCATCCTCTGCCACTAGCTCGAAAGCATGTAGAGAGAAGCAGCG 

ACGGGACAGGTTGAATGACAAGTTTATGGAATTGGGTGCAATTTTGGAGCCTGGAAATCC 

TCCCAAAACAGACAAGGCTGCTATCTTGGTTGATGCTGTCCGCATGGTGACACAGCTACG 

GGGCGAGGCCCAGAAGCTGAAGGACTCCAATTCAAGTCTTCAGGACAAAATCAAAGAGTT 

AAAGACTGAGAAAAACGAGCTGCGAGATGAGAAACAGAGGCTGAAGACAGAGAAAGAAAA 

GCTGGAGCAGCAGCTGAAAGCCATGAATGCTCCTCAACCAAGTTTTTTCCC^ 

TATGATGCCTACTGCTTTTGCTTCAGCGCAAGGCCAAGCTCCTGGAAACAAGATGGTGCC 

AATCATCAGTTACCCAGGAGTTGCCATGTGGCAGTTCATGCCTCCTGCTTCAGTCGATAC 

TTCTCAGGATCATGTCCTTCGTCCTCCTGTTGCTTAATCAAGAAAAATCATCAACCGGTT

TGCTTCTTGCTTCCGCTTAAAAGAAAAGTCTCCATTTGTTTTGCTCTCCTCTCTTTCTCG 

GCTTTCTTAGTCTTATCCTTTTGCTTTGTCGTGTTATCATCGTAACTGTTATCTGTTGAA 

CAATGATATGAC^TTGTAAACTCCAATTGCTOCGCGGAATGTTATCTATTCAC^ 

TTTAAGTAGAGTTTGGCAAAAAAAAAA 

>G791 Amino Acid Sequence (domain in AA coordinates: 75-143) 
MVSPENANWICDLIDADYGSFTIQGPGFSWPVQQPIGVSSNSSAGVDGSAGNSEASKEPG 
SKKRGRCESSSATSSKACREKQRRDRLNDKFMELGAILEPGNPPKTDKAAILVDAVRMVT
QLRGEAQKLKDSNSSLQDKIKELKTEKNELRDEKQ^ 

APPMMPTAFASAQGQAPGNKMVPIISYPGVAMWQFMPPASVDTSQDHVLRPPVA*
>G861 (158..880)

CTTCTTCCTCCTCCTCGATCTCTTCTCTTTACTCTCTCTTTAATCATCTCTCATTCTTGA 

ATCTTGATCCATCAAAATCAATCCCGTTCTCGAAAGATCCATTAAAATCAAAACCTAAGC

TCTCTCTCTTGCTTCTAGGGTTTTTTTGTTCGTTGTGATGGCGAGAGAAAAGATTCAGAT 

CAGGAAGATCGACAACGCAACGGCGAGACAAGTGACGTTTTCGAAACGAAGAAGAGGGCT 

TTTCT^AGAAAGCTGAAGAACTCTCCGTTCTCTGCGACGCCGATGTCGCTCTCATCATCTT 

CTCTTCCACCGGAAAACTGTTCGAGTTCTGTAGCTCCAGCATGAAGGAAGTCCTAGAGAG 

GGATAACTTGCAGTGAAAGAACTTGGAGAAGCTTGATCAGCCATCTCTTGAGTTACAGCT 

GGTTGAGAACAGTGATCACGCCCGAATGAGTAAAGAAATTGCGGACAAGAGCCACCGACT 

AAGGCAAATGAGAGGAGAGGAACTTCAAGGACTTGA(^TTGAAGAGCTTCAGCAGCTAGA 

GAAGGCCCTTGAAACTGGTTTGACGCGTGTGATTGAAACAAAGAGTGACAAGATTATGAG 

TGAGATCAGCGAACTTCAGAAAAAGGGAATGCAATTGATGGATGAGAACAAGCGGTTGAG 

GCAGCAAGGAACGCAACTAACGGAAGAGAACGAGCGACTTGGCATGCAAATATGTAACAA 

TGTGCATGCACACGGTGGTGCTGAATCGGAGAACGCTGCTGTGTACGAGGAAGGACAGTC 

GTCGGAGTCTATTACTAACGCCGGAAACTCTACCGGAGCGCCTGTTGACTCCGAGAGCTC 

CGACACTTCCCTTAGGCTCGGCTTACCGTATGGTGGTTAGAGATGGAACAATTCAAAGAA 
GTTGATGGAGTGAGGAGAGTAATGTAAATCTTTTTAACTCGGTAGTAACAAGAGACAATG

TCTAAGTAGTGAATTCTCAAATGTTTGTGTAAGTTTCTGCCTATGGAAGAGGCTTTCATT 

TTTATGATTTTCACTATGTATGATCTCTCTTCACTGCATTTCTGGTTAGTAACGGCTTGT 

CACCGATAAACTTTCTCGTTATGGAAAGTTAGAATAAAA/^AAAAAAAAAAAAAA 

>G861 Amino Acid Sequence (domain in AA coordinates: 2-57) 

MAREKIQIRKIDNATARQVTFSKRRRGLFKKAEELSVLCDADVALIIFSSTGKLFEFCSS

SMKEVLERHNLQSKNLEKLDQPSLELQLVENSDHARMSKEIADKSHRLRQMRGEELQGLD

IEELQQLEKALETGLTRVIETKSDKIMSEISELQKKGMQLMDENKRLRQQGTQLTEENER

LGMQICNNVHAHGGAESENAAVYEEGQSSESITNAGNSTGAPVDSESSDTSLRLGLPYGG

* 

>G938 (1..1755) 

ATGATGATGTTTAACGAGATGGGAATGTATGGAAACATGGATTTCTTCTCTTCCTCCACA 
TCTCTCGATGTGTGTCCATTACC^CAAGOTGAACAAGAACCTGTAGTTGAAGATGTCGAC 
TACACCGATGATGAGATGGATGTGGATGAGCTTGAGAAGAGGATGTGGAGAGACAAAATG 
CGTTTGAAACGTCTCAAGGAGCAACAGAGTAAGTGTAAAGAAGGCGTCGATGGTTCGAAA 
CAGAGGCAGTCGCAAGAGCAAGCTAGGAGGAAGAAAATGTCTAGAGCCCAAGATGGGATC 
TTGAAGTATATGTTGAAGATGATGGAAGTTTGTAAAGCTCAAGGCTTTGTTTATGGTATT 
ATTCCTGAGAAGGGTAAGCCTGTGACTGGTGCTTCGGATAATTTGAGGGAATGGTGGAAA 
GATAAGGTTAGGTTTGATCGTAATGGTCCAGCTGCTATTGCTAAGTATCAGTCAGAGAAT 
AATATTTCTGGAGGGAGTAATGATTGTAACAGCTTGGTTGGTCCAACACCGCATACGCTT 
CAGGAGCTTCAGGACACGACTCTTGGTTCGCTTTTATCGGCTTTGATGCAACATTGTGAT 
CCACCGCAGAGACGGTTTCCTTTGGAGAAAGGAGTTTCTCCACCTTGGTGGCCTAATGGG 
AATGAAGAGTGGTGGCCTCAGCTTGGTTTACCAAATGAGCAAGGTCCTCCTCCTTATAAG 
AAGCCTCATGATTTGAAGAAAGCTTGGAAAGTCGGTGTTTTAACTGCGGTGATCAAGCAT 
ATGTCGCCGGATATTGCGAAGATCCGTAAGCTTGTGAGGCAATCAAAATGCTTGCAGGAT 
AAGATGACGGCGAAAGAGAGTGCTACTTGGCTTGCCATTATTAACCAAGAAGAGGTTGTG 
GCTCGGGAGCTTTATCCCGAGTCATGCCCTCCTCTTTCTTCTTCTTCATCATTAGGAAGC 
GGGTCGCTTCTCATTAATGATTGTAGCGAGTATGACGTTGAAGGTTTCGAGAAGGAACAA 
CATGGTTTCGATGTGGAAGAGCGGAAACCAGAGATAGTGATGATGCATCCTCTAGCAAGC 
TTTGGGGTTGCTAAAATGCAACATTTTCCCATAAAGGAGGAGGTCGCCACCACGGTAAAC 
TTAGAGTTCACGAGAAAGAGGAAGCAGAACAATGATATGAATGTTATGGTAATGGACAGA 
TCAGCAGGTTACACTTGTGAGAATGGTCAGTGTCCTCACAGCAAAATGAATCTTGGATTT 
CAAGACAGGAGTTCAAGGGACAACCACCAGATGGTTTGTCCATATAGAGACAATCGTTTA 
GCGTATGGAGCATCCAAGTTTCATATGGGTGGAATGAAACTAGTAGTTCCTCAGCAACCA 
GTCCAACCGATCGACCTATCGGGCGTTGGAGTTCCGGAAAACGGGCAGAAGATGATCACC 
GAGCTTATGGCCATGTACGACAGAAATGTCCAAAGCAACCAAACGCCTCCTACTTTGATG
GAAAACCAAAGCATGGTCATTGATGCAAAAGCAGCTCAG 

AGTGGCAATCAAATGTTTATGCAACAAGGGACGAACAACGGGGTTAACAATCGGTTCCAG 
ATGGTGTTTGATTCGACACCATTCGATATGGCAGCATTCGATTACAGAGATGATTGGCAA 
ACCGGAGCAATGGAAGGAATGGGGAAGCAGCAGCAGCAGCAGCAGCAGCAGCAAGATGTA 
TCAATATGGTTCTGA 

>G938 Amino Acid Sequence (domain in AA coordinates: 96-104) 
mt^fimsmgmygnmdff 

rl:k3*lkeqqskckegvdgskqrqsqeqarrkkmsra 

IPEKGKPVTGASDNLREWWKDKVRFDRNGPAAIAKYQSENNISGGSNDCNSLVGPTPHTL
QELQDTTLGSLLSALMQHCDPPQRRFPLEKGVSPPWWPNGNEEWWPQLGLPNEQGPPPYK
KPHDLKKAWKVGVLTAVIKHMSPDIAKIRKLVRQSKCLQDKMTAKESATWLAIINQEEVV
ARELYPESCPPLSSSSSLGSGSLLINDCSEYDVEGFEKEQHGFDVEERKPEIVMMHPLAS 
FGVAKMQHFP IKEEVATTVNLEFTRKRKQNNDMNVKVMDRS^ KMNLGF 
QDRSSRDNHQMVCPYRDNRLAYGASKFHMGGMKLVVPQQPVQPIDLSGVGVPENGQKMIT
ELMAMYDRNVQSNQTPPTLMENQ 

MVFDSTPFDMAAFDYRDDWQTGAMEGMGKQQQQQQQQQDVSIWF*
>G965 (73..1956)

GATTCTCTGTGTATGTCTGAATCCTTACAGGATCCAAGAGCTTTGGAAAAAAGATATAAT 
GAATAACAAGATATGGGTTTAGCTACTACAACTTCTTCTATGTCACAAGATTATCATCAT 
CACCAAGGAATCTTTTCCTTCTCTAATGGATTCCACCGATCATCATCAACCACTCATCAG 
GAGGAAGTAGATGAATCCGCCGTCGTCTCCGGTGCTCAAATTCCGGTTTATGAAACCGCC 
GGAATGTTGTCTGAAATGTTTGCTTACCCTGGCGGAGGTGGCGGCGGTTCCGGTGGAGAG

ATTCTTGATC^GTCTACTAAAC^GTTGCTAGAGCAACAA^CCGTCACAAC^CAACAAT 

AACTCAACTCTTCATATGTTATTACCAAATCATCATCAAGGTTTTGCTTTCACCGACGAA 

AACACTATGCAGCCGCAGCAACAACAACACTTTACATGGCCATCTTCCTCCTCCGATCAT 

CATCAAAACCGAGATATGATCGGAACCGTCCACGTGGAAGGAGGAAAGGGTTTGTCTTTA 

TCTCTCTCATCTTCATTAGCCGCAGCTAAAGCCGAGGAATATAGAAGCATTTATTGTGCA 

GCCGTTGATGGAACTTCTTCTTCTTCTAACGCATCCGCTCATCATCATCAATTCAATCAG 

TTCAAGAATCTTCTTCTTGAGAATTCTTCTTCTCAACATC^TCACCATCAAGTTGTTGGA 

CATTTTGGTTCATCATCATCATCTCCCATGGCGGCTTCTTCATCCATTGGAGGGATCTAC 

ACGTTGAGGAATTCGAAATATACGAAACCGGCTCAAGAGTTGTTGGAAGAGTTTTGTAGT 

GTTGGAAGAGGACATTTCAAGAAGAACAAACTTAGTAGGAACAACTCAAACCCTAATACT 

ACCGGTGGAGGAGGAGGCGGAGGGTCCTCGTCATCGGCCGGAACAGCTAATGATAGTCCT 

CCTTTGTCTCCGGCTGATCGGATTGAACATCAAAGAAGAAAAGTCAAGCTACTATCTATG 

CTTGAAGAGGTGGACCGACGGTACAACCACTACTGCGAACAAATGCAAATGGTAGTGAAC 

TCATTCGACCAAGTAATGGGTTACGGCGCGGCGGTTCCGTACACGACATTAGCTCAAAAG 

GCAATGTCTAGGCATTTCCGGTGTTTGAAAGACGCGGTAGCGGTTCAGCTTAAACGCAGC 

TGTGAGCTTCTAGGGGATAAAGAGGCGGCAGGGGCTGCATCCTCGGGGTTAACCAAAGGG 

GAAACGCCGCGATTGCGTTTGCTAGAGCAGAGTTTGCGTCAGCAACGAGCGTTTCATCAT 

ATGGGTATGATGGAGCAAGAGGCATGGAGACCGCAACGTGGTTTGCCTGAACGCTCCGTT 

AATATCCTTAGAGCTTGGCTATTCGAGCATTTTCTTAATCCGTACCCAAGCGATGCTGAT 

AAGCACCTCTTAGCACGACAGACTGGTTTATCCAGAAATCAGGTGTCAAATTGGTTCATA 

AATGCTAGGGTTCGCCTATGGAAACCAATGGTGGAAGAGATGTATCAACAAGAAGCAAAA 

GAAAGAGAAGAAGCAGAAGAAGAAAATGAAAATCAACAACAACAAAGAAGACAGCAAC^ 

ACAAACAACAACGACACGAAACCCAACAACAATGAAAACAACTTCACTGTCATAACCGCA 

CAAACTCCAACGACGATGACATCGACACATCACGAAAACGACTCTTCATTCCTCTCTTCC 

GTCGCCGCCGCTTCTCACGGCGGTTCAGACGCGTTCACCGTCGCCACGTGTCAGCAAGAC 

GTCAGTGACTTCCACGTCGACGGAGATGGTGTGAACGTCATAAGATTCGGGACCAAACAG 

ACTGGTGACGTGTCTCTTACGCTTGGTCTACGCCACTCTGGCAATATTCCTGATAAGAAC 

ACTTCTTTCTCGGTTAGAGACTTTGGAGATTTTTAGTCTTCTTTGTTTCTCAATTTATTC

ATC 

>G965 Amino Acid Sequence (domain in AA coordinates: 423-486) 
MGLATTTSSMSQDYHHHQGIFSFSNGFHRSSSTTHQEEVDESAVVSGAQIPVYETAGMLS
EMFAYPGGGGGGSGGEILDQSTKQLLEQQNRHtt^ 
PQQQQHFTWPSSSSDHHQNRDMIGTVHVEGGKGL^ 

TSSSSNASAHHHQFNQFKNLLLENSSSQHHHHQVVGHFGSSSSSPMAASSSIGGIYTLRM

SKYTKPAQELLEEFCSVGRGHFKKNKLSRHNSOTNTTGGGGGG 

ADRIEHQRRKVKLLSMLEEVDRRYimYCEQM 

HFRCLKDAVAVQLKRSCELLGDKEAAGAASSGLTKGETPRLRLLEQSLRQQRAFHHMGMM

EQEAWRPQRGLPERSVNIIiRAWLFEHFLNPYPSDADKHIiIiARQ 

RLWKPMVEE^QQEAKEREEAEEENENQQQQRRQQQTNNNDTI^ 

TMTSTHHENDSSFLSSVAAASHGGSDAFTVATCQQD^ 

SLTLGLRHSGNIPDKNTSFSVRDFGDF*

>G1143 (54..677)

AAATAAGAATATAAACACTTTTGTCTGAAAAATTATCAAAGAAGAAGAAATAAATGGGTG 

GAGGAAGCAGATTTCAAGAACCAGTGAGGATGAGCCGTAGGAAACAAGTAACAAAAGAGA 

AGGAAGAAGATGAAAACTTCAAATCTCCAAATCTTGAAGCAGAGAGACGTAGAAGAGAGA 

AGCTTCATTGTCGGCTTATGGCTCTGCGATCTCATGTCCCCATTGTCACCAACATGACTA 

AAGCAAGTATTGTTGAAGATGCGATTACTTACATAGGAGAGCTTCAAAACAATGTTAAGA

ATCTCTTAGAGACATTTCATGAAATGGAAGAAGCTCCTCCTGAGATTGATGAAGAACAAA 

CGGATCGAATGATAAAACCTGAAGTTGAAACTAGTGATCTTAACGAAGAGATGAAGAAAC 

TCGGAATCGAGGAGAATGTGCAATTGTGTAAGATTGGGGAGAGGAAGTTTTGGTTAAAGA 

TCATAACAGAGAAGAGAGATGGGATCTTTACTAAATTCATGGAGGTTATGAGATTTCTCG 

GATTCGAGATTATCGATATTAGTCTAACAACTTCAAATGGAGCAATTCTTATTAGTGCCT 

CTGTTCAGAG^CAGGAACTCTGTGATGTTGAACAGACAAAAGATTTTCTTTTGGAAGTTA 

TGAGAAGC^UVTCCATAAGTATTAATTATATACATCTTGGAAATTTCTTGATCTAATAACA 

TTTCCATTGGTTTTTATTACATTGTTGT^ 

AAGAGTTTGTGTTACAAGCCAATGA 
>G1143 Amino Acid Sequence (domain in AA coordinates: 33-82)

MGGGSRFQEPWMSRRKQVTKEKEEDENFKSPNLEAERRRREKLH 

MTKASIVEDAITYIGELQNNVKNLLETFHEMEEAPPE 

KKLGIEENVQLCKIGERKFWLKIITEKRDGIFTKFMEVMRFLGFEIIDISLTTSNGAILI

SASVQTQELCDVEQTKDFLLEVMRSNP* 

>G1190 (209..2020)

TCCTGTCCCAAAACCAAAAGACTTGAGAGTGTGTCTTTAGAGAGAGATCTTCTCTCTTTT 
ATCTTACGACTCTCACTTCTTATCTCAAATCTACTTCAAOTCTATTTCCAGTCTCCACA^ 
TTTCCCACAAATTTCAACTCTTGTTCTCTTCCTCGAAAGTAAAAAACAAATCGTTGCAAG 
TGAGGTTTGGTTTTGGTGTTATAGAATTATGAAGAGCGGGAAGCAATCTTCGCAACCTGA 
AAAGGGTACTTCCAGGATCTTGTCACTGACTGTCCTGTTTATCGCATTTTGCGGTTTCTC 
CTTCTACCTCGGTGGTATATTTTGCTCTGAGAGAGACAAGATTGTAGCCAAGGATGTCAC 
AAGGACGACTACAAAGGCTGTAGCTTCCCCTAAAGAACCTACAGCTACTCCTATTCAAAT 
CAAATCCGTTTCTTTCCCGGAGTGCGGGTCAGAGTTCCAAGATTACACCCCGTGCACCGA 
TCCAAAGAGGTGGAAGAAGTATGGTGTCCATCGCTTAAGTTTCTTGGAGCGTCATTGTCC 
TCCGGTATATGAAAAGAATGAGTGTTTGATTCCACCACCAGACGGGTATAAACCGCCTAT 
AAGATGGCCCAAGAGCCGAGAACAGTGTTGGTACAGGAACGTGCCTTATGATTGGATCAA 
TAAGCAAAAGTCTAACCAGC^TTGGCTTAAGAAAGAAGGAGATAAGTTCCATTTCCCTGG 
TGGTGGTACCATGTTCCCTCGTGGAGTTAGTCACTATGTTGATTTGATGCAAGATCTGAT 
TCCTGAAATGAAAGACGGAACAGTCAGGACCGCCATTGATACTGGCTGTGGGGTTGCGAG 
CTGGGGAGGCGATCTTTTGGACCGTGGGATACTATCACTCTCTCTTGCTCCAAGAGATAA 
CCATGAA(^TCAGGTTCAATTTGCTCTTGAACGTGGAATTCCTGCGATTCTCGGGATCAT 
CTCTACGCAACGTCTCCCTTTTCCTTCAAATGCATTTGATATGGCTCATTGTTCAAGATG 
TCTTATTCCCTGGACAGAATTTGGTGGAATCTATTTACTTGAGATTCACCGTATAGTTCG 
ACCTGGAGGTTTTTGGGTTCTTTCTGGTCCACCTGTGAACTATAATAGACGATGGCGTGG 
ATGGAACACAACCATGGAAGATCAGAAATCTGACTACAACAAGCTTCAGTCACTTCTAAC 
CTCCATGTGTTTCAAAAAGTACGCTCAAAAAGATGACATAGCCGTGTGGCAGAAACTCTC 
AGACAAATCTTGCTATGACAAAATCGCTAAGAACATGGAAGCTTACCCTCCCAAATGTGA 
CGACAGTATAGAACCTGATTCTGCTTGGTACACTCCACTCCGTCCTTGCGTGGTTGCCCC 
GACACCTAAAGTCAAGAAGTCTGGTCTCGGATCAATCCCAAAATGGCCCGAGAGGTTACA 
TGTCGCGCCCGAGAGAATCGGTGATGTTCACGGAGGGAGTGCGAACAGTTTGAAACACGA 
TGATGGTAAATGGAAGAACAGAGTTAAGCATTACAAGAAAGTTTTACCAGCTCTTGGGAC
AGACAAGATAAGAAATGTTATGGATATGAACACTGTTTATGGAGGTTTCTCTGCGGCCCT 
CATTGAGGATCCCATTTGGGTCATGAACGTTGTATCATCGTACAGCGCAAATTCGCTTCC 
TGTTGTCTTTGATCGCGGTCTCATCGGGACTTACCACGACTGGTGCGAAGCTTTCTCAAC 
GTATCCAAGAACATATGATCTTCTTGACCT^ 

GTGTGAGATGAAGTACATTTTGCTAGAGATGGACAGGATCTTGCGGCCGAGTGGATATGT 
TATAATCCGAGAATCGAGTTATTTCATGGACGCAATCACAACGTTAGCGAAAGGGATAAG 
GTGGAGTTGCCGGAGAGAGGAGACTGAGTATGCAGTCAAAAGTGAGAAGATTCTGGTTTG 
CCAGAAAAAGCTATGGTTTTCGTCAAACCAAACCTCTTGATGAGACCACCTGTATCATAG 
TGTTTATCATCTCCTGTGATGCACACTACAGAGAGAAGGATCTAGTCCTTTGAGTCCAAG 
ATATAGCTCTATAAACAATCTCCTTTTTTTGTTCTCTTTAATTTCTTGGGTATTTCACGG 
TATAGATTGATATTATATATTTTTTAATTATATTTTTAATATATAGATATATTAGTATGT
GGTTTAAACACTATTATTATCAAGGTCTTAAAGATTTGCTTTGCAAGAGTTAAAAAATGT 
TGGAGTAAGGACCTCTTGATTAATAAATTGACTGACGCAGCAAA 

>G1190 Amino Acid Sequence (domain in AA coordinates: entire protein)

MKSGKQSSQPEKGTSRILSLTVLFIAFCGFSFYLGGIFCSERDKIVAKDVTRTTTKAVAS 

PKEPTATPIQIKSVSFPECGSEFQDYTPCTDPKRWKKYGVHRLSFLERHCPPVYEKNECL

IPPPDGYKPPIRWPKSREQCWYRNVPYDWINKQKSNQHWLKKEGDKFHFPGGGTMFPRGV

SHYVDLMQDLIPEMKDGTVRTAIDTGCGVASWGGDLLDRGILSLSLAPRDNHEAQVQFAL

ERGIPAILGIISTQRLPFPSNAFDMAHCSRCLIPWTEFGGIYLLEIHRIVRPGGFWVLSG

PPWYl^WRGWOTTMEDQKSDYNI^ 

KNMEAYPPKCDDSIEPDSAWYTPLRPCVV^ 

HGGSANSLKHDDGKWKimVKHYKKVIiPALGTDKIRim^ 

VVSSYSANSLPVVFDRGLIGTYHDWCEAFSTYPRTYDLLHLDSLFTLESHRCEMKYILLE
MDRILRPSGYVIIRESSYFMDAITTLAKGIRWSCRREETEYAVKSEKILVCQKKLWFSSN
QTS* 
>G1198 (230..1675)

TCTTTTCAAATTCCAATCATTTGATCAACTAATCAAGAATTAATTATAAGACTTTGCAAT 
CTCTCTCCCTCTCCCTCTCCCTAGCTAGTTCTCTCTTGTGTTTCTTAACTCGAGCTTCTC 
TCAATAGTGATTATCATCTTTTTCATCATTTCAAGATTTAATGTGTTTTGCAGAAAAGAG 
ACTAATCAAGAAGAGATATCATCAATTGAAGCTGTTTTCTTGAGTAGAGATGGCGAACCA 
TAGAATGAGCGAAGCTACAAACCATAACCACAATCATCATCTTCCTTATTCACTTATTCA 
TGGTCTCAACAACAATCATCCATCTTCTGGTTTCATTAACCAAGATGGATCGTCCAGTTT 
CGATTTTGGAGAGCTAGAAGAAGCAATTGTTCTGCAAGGTGTCAAGTATAGGAACGAGGA 
AGCCAAGCCACCTTTATTAGGAGGAGGAGGAGGAGCTACGACTCTGGAGATGTTCCCTTC 
GTGGCCAATCAGAACTCACCAAACTCTTCCTACTGAGAGTTCCAAGTCAGGAGGAGAGAG 
CAGCGATTCAGGATCGGCTAATTTCTCCGGCAAAGCTGAAAGTCAACAACCGGAGTCTCC 
TATGAGTAGCAAACATCATCTCATGCTTCAACCTCATCATAATAACATGGCAAACTCAAG 
TTCAACATCTGGACTTCCTTCCACTTCTCGAACTTTAGCTCCTCCTAAACCTTCGGAAGA 
TAAGAGGAAGGCTACAACTTCAGGCAAACAGCTTGATGCTAAGACGTTGAGACGTTTGGC 
CCAAAATAGAGAAGCTGCTCGGAAAAGCCGTCTTAGGAAAAAGGCGTATGTGCAACAGCT 
AGAATCAAGTAGGATAAAGCTTTCCCAATTGGAGCAAGAACTTCAGCGAGCTCGTTCTCA 
GGGGCTGTTCATGGGTGGTTGTGGACCACCAGGACCTAACATCACTTCCGGAGCTGCAAT 
ATTTGACATGGAATATGGGAGATGGCTAGAGGATGATAACCGGCATATGTCGGAGATTCG 
AACCGGTCTTCAGGCTCATTTATCTGACAATGATTTAAGGTTGATCGTTGACGGTTACAT 
TGCTCATTTTGATGAGATATTCCGATTAAAAGCCGTGGCAGCGAAAGCCGATGTTTTTCA
CCTCATCATTGGGACATGGATGTCCCCAGCCGAACGTTGTTTTATTTGGATGGCTGGTTT 
CCGTCCATCCGACCTAATCAAGATATTGGTGTCGCAAATGGATCTATTGACGGAGCAACA 
ACTGATGGGAATATATAGCCTACAACACTCGTCGGAACAAGCAGAGGAGGCTCTCTCGCA 
AGGCCTCGAACAACTTCAGCAATCTCTCATCGATACTCTCGCCGCATCTCCAGTCATTGA 
CGGAATGCAACAAATGGCTGTCGCTCTCGGAAAGATCTCTAATCTCGAAGGCTTTATCCG 
CC^GGCTGATAACTTGAGGCAGCAGACCGTTCACCAGCTGAGGCGGATCTTGACCGTCCG 
ACAAGCTGGACGGTGTTTCCTAGTCATCGGAGAGTACTATGGACGGCTCAGAGCTCTTAG 
CTCCCTTTGGTTGT(^CGCCCACGAGAGAC^CTGATGAGTGATGAAACCTCTTGTCAAAC 
GACGACGGATTTGCAGATTGTTCAGTCATCTCGGAACCACTTCTCCAATTTCTGAATGGA 
ATGAAACTTTGTATAACTAAAAGGCCAAGTTTGATTGTCTGTCGTAATTTCACCTATTTC 
CTTTAAAGTTGTACTAGAGAAAAGATAGGATCTTCCTTCG 

>G1198 Amino Acid Sequence (domain in AA coordinates: 173-223) 
MANHRMSEATimN^^ 

RNEEAKPPLLGGGGGATTLEMFPSWPIRTHQTLPTESSKSGGESSDSGSANFSGKAESQQ
PESPMSSKHHIrf^QPHHNNMAN^ 

RRIiAQlIREAARKSRLRKKAWQQLESSRIKLSQIjEQEIiQRARSQGrjFMGGCGPPGPNIT^ 
GAAIFDMEYGRWLEDDNRHMSEIRTGLQAHLSDNDLRLIVDGYIAHFDEIFRLKAVAAKA
DVFHLIIGTWMSPAERCFIWMAGFRPSDLIKILVSQMDLLTEQQLMGIYSLQHSSQQAEE
ALSQGLEQLQQSLIDTLAASPVIDGMQQMAVALGKISNLEGFIRQADNLRQQTVHQLRRI
LTVRQAARCFIiVIGEYYGRLRAIiSSLWLSRPRETLMSDETSCQTTTDLQIVQSSRiraFS^ 
F* 

>G1226 (212..1159)

CTGCATTTATTAAGAACAGTTTAGAAAGTGTCAACCCCTAAAGGAATGTTTTTAGTTTAG 

AGGAAAGAGAGAGAAGAAGAAGCAGCAGCAGAAGTTGTTAATTTGAAGACTATTTGAGGA 

AAGACACCTATATCTAAATACTCAAAGTTACAAAAATATTACTTCAGAAAACAGTTCCAT 

TAGAGAGACTCATAAAGCTTCTCATCTAATTATGAGTGGATTGATGAGTTTTGGTGAATT 

AGAAGACCAATTTGGTGAGATTTCAGACACTACTATGGAAGAGAAGATACCATTTCTGCA 

AATGCTTCAATGCATAGT^CACCCTTTTACAACAAGAGAACCAAATCAGTTTC^ 

ACTTCTCCAGATCCAAACCCTAGAATCAAAGAGCTGTCTCACCCTTGAAACAAACATCAA 

AAGAGATCCGGGTCAAACAGATGACCCGGAAAAGGATCCAAGAACAGAAAACGGAGCAGT 

AACGGTCAAAGAAAAAAGAAAACGGAAACGTACAAGAGCTCCAAAGAACAAAGACGAAGT 

TGAAAACCAAAGGATGACTCACATTGCCGTCGAACGTAATCGAAGACGACAAATGAACGA 

ACACTTAAACTCTCTCCGATCTCTC^TGCCTCCTTCGTTTCTTCAACGGGGTGACQ^AGC 

TTCGATTGTAGGAGGGGCAATAGATTTCATCAAGGAACTAGAGCAACTCTTGCAATCTCT 

AGAAGCTGAGAAACGAAAGGATGGAACTGATGAAACTCCTAAAACGGCGTCGTGTTCTTC 

ATCTTCGTCTCTTGCATGCACTAACTCTTCTATTTCTAGCGTGTCTACGACGTCGGAAAA 

TGGATTTACGGCGAGATTCGGCGGTGGAGATACGACAGAAGTGGAGGCTACGGTGATACA 
GAACCATGTGAGCTTAAAAGTTCGGTGTAAGAGAGGAAAACGACAGATCTTAAAAGCTAT
TGTCTCGATTG^GAACTAAAGCTTGCGATTCTACATCTCACTATCTCTTCTTCCTTTGA 
CTTTGTCATCTACTCTTTCAATCTCAAGATGGAAGATGGTTGTAAATTAGGATCAGCAGA 
TGAGATAGCGACAGCCGTTCATCAGATCTTCGAGCAAATCAACGGTGAAGTCATGTGGTC 
AAATCTTAGTCGAACTTAGTTGACTTTTGACTCCTAGTAACGTGTGTAAACTTTAGGTTA 
CAAAGAAAAGGGACGTGATATAAATAAGAAAAACCAAAGAGGTGAAATTTTGGGAGTTTT
AATTATTATCTTATACTTTTTGGATTTTAGATTAGTAGCAAACTCGCAGTGTTCTACGAT 
GACATTATTATTGGTCACATGAAGGTTTAGGTTAAAAAAAAA 

>G1226 Amino Acid Sequence (domain in AA coordinates: 115-174)

MSGLMSFGELEDQFGQISDTTMEEKIPFLQMLQCIEHPFTTTEPNQFLQSLLQIQTLESK

SCLTLETNIKM3PGQTDDPEKDPRTENGAV 

ERNR^QMNEHLNSIiRSIiMPPSFLQRGDQASIVGGAIDFIKELEQLliQSLEA^ 

ETPKTASCSSSSSLACTNSSISSVSTTSENGFTARFGGGDTTEVEATVIQNHVSLKVRCK

RGKRQILKAIVSIEELKLAILHLTISSSFDFVIYSFNLKMEDGCKLGSADEIATAVHQIF

EQINGEVMWSNLSRT*

>G1451 (124..2559)

TTTGTACTTCCGGAGCTAAAGAGTTATAGCTACTGTAGTAGCTGGAAGTGAAGAAGATTT 

TTTAATAGATTGTACGGAAAAATTAGGGTTTTCAAAGTl^GGTTTCTTGAAGTTGAATTA 

GACATGAAGCTGTCAACATCTGGATTGGGTCAACAGGGTCATGAAGGAGAGAAGTGTCTG 

AATTCTGAGCTATGGCATGCTTGTGCTGGACCATTAGTCTCTCTTCCATCATCTGGTAGT 

CGAGTTGTTTACTTTCCACAGGGTCACAGTGAACAGGTAGCTGCTACAACTAATAAGGAA 

GTTGATGGTCACATACCCAATTACCGAAGCCT^ 

AATGTTAGAATGGATGCAGATGTTGAGACGGATGAAGTC 

CCATTGACACCGGAGGAGCAGAAGGAAACATTTGTACCGATTGAGTTGGGGATACCGAGT 
AAGCAACCTAGTAATTATTTTTGTAAGACTCTCACAGCTAGTGATACCAGTACACATGGA 
GGGTTTTCTGTTCCTAGACGTGCTGCTGAGAAAGTGTTTCCTCCATTGGATTACACACTG 
CAGCCACC^GCTC^^GAACTGATTGCAAGGGATCTCCATGATGTTGAATGGAAGTTTAGG 
CATATCTTTCGGG<^CAGCCGAAACGGCATCTCCTAACTACTGGATGGAGTGTCTTTGTC 
AGTGCCAAGCGACTAGTAGCTGGAGATTCTGTCATTTT 

CTCTTTTTGGGAATTCGTCATGCCACTCGGCCGCAGACTATTGTACCATCATCTGTTTTA 
TCTAGTGATAGCATGCATATTGGACTCCTTGCTGCTGCTGCACATGCTTCTGCAACTAAT 
AGCTGTTTCACTGTTTTCTTTCATCCAAGGGCTAGCCAATCTGAGTTTGTGATAGAACTT 
TCCAAGTACATTAAAGCCGTTTTTCACACGCGTATTTCAGTTGGGATGCGCTTTCGCATG 
CTCTTCGAGACAGAAGAGTCGAGTGTCCGCAGGTACATGGGTACTATAACTGGTATTAGT 
GATCTAGATTCTGTTCGTTGGCCAAACTCTCATTGGCGATCTGTGAAGGTTGGTTGGGAT 
GAATCGACTGCAGGGGAGAGACAGCCAAGGGTTTCTTTATGGGAGATTGAGCCTCTGACT 
ACCTTTCCTATGTATCCATCTCTTTTTCCTCTCAGACTAAAACGTCCATGGCATGCTGGC 
ACATCATCTTTGCCTGATGGAAGGGGTGATTTGGGAAGTGGTCTAACATGGCTAAGAGGG 
GGAGGTGGAGAGCAGCAAGGTTTGCTTCCTCTAAATTATCCATCTGTTGGTTTGTTTCCA 
TGGATGCAAGAAAGGCTGGATCTCAGTCAAATGGGGACTGATAATAATCAGCAATACCAA 
GCAATGTTAGCTGCTGGGTTGCAGAACATCGGCGGTGGAGATCCTTTAAGACAGCAGTTT 
GTACAGCTGCAAGAGCCTCACCACCAATATCTTCAACAATCAGCTTCCCATAATTCTGAT 
TTGATGCTTCAGCAGCAACAGCAGCAACAAGCGTCACGCCATCTCATGCATGCTCAAACA 
CAGATTATGAGTGAGAATCTTCCGCAGCAGAATATGCGACAAGAAGTTAGTAACCAACCA 
GCTGGAGAGCAGGAACAGCTACAGCAACCGGACGAAAATGCATATCTTAATGCTTTCAAA 
ATGCAAAATGGCCATCTTCAACAGTGGCAGCAGCAATCAGAGATGCCATCTCCCTCGTTC 
ATGAAGTCAGATTTTACTGACTCAAGCAACAAATTTGCAAGAACTGCTAGTCCGGCTTCT 
GGAGATGGCAATCT^TTGAATTTTTCTATAACCGGTCAGTCTGTACTCCCTGAGCAGTTA 
ACAACAGAGGGCTGGTCTCCAAAAGCATCCAACACTTTTTCTGAACCGTTGTCACTTCCA 
CAAGCCTATCCTGGGAAGAGTCTTGCTCTAGAACCCGGAAATCCGCAGAATCCCTCTCTT 
TTCGGTGTTGATCCCGACTCTGGACTCTTCCTCCCCAGTACGGTTCCCCGCTTTGCTTCT 
TGA.TCAGGAGATGCTGAAGCTTCCCCTATGTCACTAACAGATTCAGGATTTCAGAATTCC 
TTATATAGCTGCATGGAAGACACAACTCATGAGTTATTGCATGGAGCTGGACAGATTAAC 
TCGTCCAACCAAACCAAGAACTTTGTAAAGGTTTATAAATCTGGTTCGGTTGGGCGTTCA 
TTAGACATCTCCCGATTCAGCAGCTACCACGAGCTGCGAGAAGAGTTAGGGAAGATGTTT 
GCTATCGAAGGGTTGTTGGAAGACCCCCTTAGATCAGGCTGGCAGCTTGTATTCGTTGAC 
AAGGAAAATGATATTCTTCTCCTTGGTGATGACCCATGGGAGTCATTTGTGAATAACGTT 






TGGTACATAAAGATACTATCACCAGAAGATGTGCATCAAATGGGAGATCATGGAGAAGGC 

AGTGGTGGGTTATTCCCGCAA7VACCCGACCCATCTCTAGAAGCTGCTTCGGTGTTAGTCT 

CATCATGCTACAACGCGGGAGCCCTTTGTTTCCCATTTGAAGTCGTTTCCACTCATCTTT 

ATATGCCATTCGTTCGCATCTCTCTCGTTTTGACGTTTTTAGAAAGAAAC^TAATCATAT 

TTGTGAGTATGGGTCCTGAAACTTTAGGACGTACTTTAGCTTGTATTAGACAGACACTCT 

CGTCATAAACATAAGAACCTTTATGTAGCTGTCTCAGGGTAACTAAACTTTTCTAG 

>G1451 Amino Acid Sequence (domain in AA coordinates: 22-357) 

MKLSTSGLGQQGHEGEKCLNSELWHACAGPLVSLPSSGSRVVYFPQGHSEQVAATTNKEV

DGHIPNYPSLPPQLICQLHNVTMHADVE^ 

QPSNYFCKTLTASDTSTHGGFSVPRRAAEKVTPPLDYTLQPPAQELIARDL 

IFRGQPKRHLLTTGWSVFVSAKRLVAGDSVIFIRI^KNQLFLGIRHATRPQTIVPSSVIjS 

SDSMHIGLLAAAAHASATNSCFTVFFHPRASQSEFVIQLSKYII^ 

FETEESSVRRYMGTITGISDLDSVRWPNSHWRSVKVGWDESTAGERQPRVSLWEIEPLTT

FPMYPSLFPLRLKRPWHAGTSSLPDGRGDLGSGLTWLRGGGGEQQGLLPLNYPSVGLFPW

MQQRLDLSQMGTDNNQQYQAMLAAGLQNIGGGDPLRQQFVQLQEPHHQYLQQSASHNSDL

MLQQQQQQQASRHLMHAQTQIMSENGCiPQQNMRQEVSNQPAGQQQQLQQPDQNAYI^^ 

QNGHLQQWQQQSEMPSPSFMKSDFTDSSNKFATTASPASGDGI^LNFSITGQSVLPEQIjT. 

TEGWSPKASNTFSEPLSLPQAYPGKSLALEPGNPQNPSLFGVDPDSGLFLPSTVPRFASS

SGDAEASPMSLTDSGFQNSLYSCMQDTTHELLHGAGQINSSNQTKNFVKVYKSGSVGRSL

DISRFSSYHELREELGKMFAIEGIJjEDPLRSGWQLVFVDKIENDI^ 

YIKILSPEDVHQMGDHGEGSGGLFPQNPTHL* 

>G1478 (1..354) 

atgtgtagagggtttgagaaagaag2^agagagaagaagcgacaatggaggatgccaaaga 
ctatgcacggagagtcacaaagctccggtaagctgtgagctttgcggcgagaacgccacc 
gtgtattgtgagggagacgcagctttcctttgtac^ 

gctaattttctagctcggagacatctccggcgcgtgatctgcacgacctgtcggaagcta 

ACTCGTCGATGTCTTGTCGGTGATAATTTTAATGTTGTTTTACCGGAGATAAGGATGATA 

GCAAGGATTGAAGAACATAGTAGTGATCACAAAATTCCCTTTGTGTTTCTCTGA 

>G1478 Amino Acid Sequence (domain in AA coordinates: 32-76)

MCRGFEKEEERRSDNGGCQRLCTESHKAPVSCELCGENATVYCEADAAFLCRKCDRWVHS

ANFLARRHLRRVICTTCRKLT^ 

>G1496 (116..1123)

AAACCCACCAAATAACTCAGAGCTTTTTTGCATTTTTTCCGA

ACTTTTGGTCTCACTTTAAAAGATCATAAGTTGAAAGATTTCTGCAGAGAACAATATGTT 
GGAAGGTCTTGTCTCTCAAGAAAGCTTGTCCTTAAACTCTATGGACATGTCTGTACTTGA 
AAGGCTTAAATGGGTACAAGAGCAACAAC^GCAACTGC^ 

TAATAATTCACCTGAACTTCTTCAGATACTTCAGTTCCATGGAAGGAACAATGATGAGTT 

GTTGGAGAGTAGTTTCAGCCAATTTCAAATGCTTGGATCTGGTTTTGGACCAAACTATAA 

CATGGGTTTTGGTCCTCCACATGAATCCATTTCAAGAACAAGTAGCTGCCATATGGAAC 

TGTGGATACAATGGAGGTTTTGTTGAAGACCGGTGAAGAAACCAGAGCCGTTGCCTTGAA 

GAACAAGAGAAAACCAGAGGTTAAGACAAGGGAAGAGCAAAAGACAGAGAAGAAGATCAA 

AGTAGAGGCTGAGACAGAGTCAAGCATGAAAGGAAAATCAAACATGGGAAACACTGAAGC 

ATCTTCAGACACTTCAAAGGAGACATCGAAAGGAGCTTCAGAGAATCAGAAATTAGATTA 

TATCCACGTGAGAGCTCGTCGAGGCCAAGCCACTGACAGACACAGCTTAGCAGAAAGGGC 

GAGAAGAGAAAAGATCAGCAAGAAAATGAAATATCTGCAAGATATTGTGCCTGGATGCAA 

TAAGGTCACAGGAAAAGCTGGTATGCTTGATGAGATCATCAATTATGTTCAATGTCTCCA 

AAGACAAGTCGAGTTCCTGTCGATGAAACTTGCTGTCTTGAACCCGGAACTAGAGCTTGC 

CGTGGAAGATGTATCCGTAAAACAGGCTTACTTTACAAATGTAGTTGCTTCAAAGCAATC

AATAATGGTTGATGTGCCATTGTTTCCGTTAGACCAGCAAGGATCTCTAGATTTGTCTGC 

GATAAACCCGAACCAAACGACATCTATCGAAGCTCCATCTGGAAGCTGGGAAACTCAATC 

ACAGAGTCTCTACAACACATCTAGCCTCGGTTTTCATTACTAAGCAAGATTCATTGAAAC 

AACATGGTTGACATCAATCAATCATCAAAATCAGAAGCAAATTCTATTACATTTGC 

CAAAGTAGTAATTTCGAAATTTGGTTAATGCATTATCCTTTGATCCTTGTTTTCTGATAT 

TTAAACCAGAAGAACTGGAGATAGCAATCCAATGATCTTGTCACCA 

>G1496 Amino Acid Sequence (domain in AA coordinates: 184-248) 
MLEGLVSQESLSLNSMDMSVLERLKW^ 

ELLESSFSQFQMLGSGFGPNYNMGFGPPHESISRTSSCHMEPVDTMEVLLKTGEETRAVA
LKNKRKPEVKTREEQKTEKKIKVEAETESSMKGKSNMGNTEASSDTSKETSKGASENQKL
DYIHVRARRGQATDRHSLAERARREKISKKMKYLQDIVPGCNKVTGKAGMLDEIINYVQC
LQRQVEFLSMKLAVLNPELELAVEDVSVKQAYFTNVVASKQSIMVDVPLFPLDQQGSLDL
SAINPNQTTSIEAPSGSWETQSQSLYNTSSLGFHY* 
>G1526 (1..3090) 

ATGGGAACGAAAGTCTCAGACGATCTTGTTTCCACCGTCAGATCAGTCGTGGGTTCCGAT 

TACTCAGATATGGATATAATCAGGGCTTTACACATGGCGAATCATGATCCAACGGCTGCT 

ATCAATATAATCTTCGACACTCCAAGTTTCGCCAAACCTGATGTAGCCACTCCTACCCCG 

AGCGGCTCTAATGGAGGGAAGCGAGTTGATAGTGGATTAAAGGGCTGTACTTTTGGTGAC 

AGCGGAAGTGTTGGAGCGAATCATCGCGTGGAGGAAGAAAATGAGAGTGTTAATGGTGGA 

GGAGAAGAGAGTGTTTCAGGGAATGAGTGGTGGTTTGTTGGTTGTTCTGAATTGGCTGGG 

TTATCGACATGTAAAGGAAGGAAATTGAAGTCTGGTGATGAATTGGTGTTCACGTTTCCG 

CATAGTAAAGGATTAAAGCCTGAGACTACGCCTGGGAAGCGCGGTTTTGGGCGGGGAAGG 

CCAGCTTTGCGTGGTGCTTCTGATATCGTTAGGTTCTCTACAAAGGATTCAGGAGAGATT 

GGTAGAATACCAAACGAGTGGGCTCGGTGTCTTCTACCACTTGTGAGAGACAAGAAAATT 

AGGATAGAAGGCAGTTGCAAGTCGGCGCCTGAAGCTTTGAGCATCATGGATACAATTCTT 

CTGTCTGTAAGCGTGTACATTAATAGTTCCATGTTTCAAAAGCATAGTGCGACTTCATTT 

AAGACAGCTAGTAATACGGCAGAGGAATCAATGTTCCATCCTCTCCCAAATCTCTTTCGG 

TTACTCGGTTTGATCCCCTTTAAGAAGGCAGAGTTTACTCC^GAGGATTTTTACTCTAAG 

AAGCGACCTTTGAGTTCGAAGGATGGTTCTGCTATTCCTACTTCGTTGCTTCAATTAAAC 

AAGGTCAAGAATATGAATCAAGATGCAAACGGAGATGAAAATGAGCAGTGTATCAGCGAT 

GGTGATCTTGATAACATTGTTGGTGTTGGGGACAGTTCTGGATTAAAGGAAATGGAAACT 

CCACATACACTTCTGTGTGAGCTTCGTCCATACCAAAAGCAGGCACTTCATTGGATGACC

CAACTGGAGAAAGGAAATTGCACTGATGAGGCAGCAACAATGCTTCACCCGTGTTGGGAA 

GCATACTGTTTAGCAGACAAGAGGGAACTGGTTGTCTACCTGAATTCTTTTACTGGTGAT 

GCn^ACAATACACTTCCCTAGCACACTTCAAATGGCZAAGAGGAGGAATA 

ATGGGTCTTGGAAAGACTGTAATGACCATATCCCTTTTGCTTGCCCATTCTTGGAAAGCT 

GCATCAACTGGGTTTCTATGCCCCAACTATGAAGGAGACAAAGTGATCAGCAGTTCTGTA 

GATGATCTCACTAGTCCCCCGGTGAAGGGAACCAAATTTCTAGGCTTTGATAAGAG 

CTTGAACAAAAAAGTGTACTTCAAAATGGTGGTAACCTGATTGTATGTCCGATGACACTT 

TTAGGACAGTGGAAGACAGAGATTGAAATGCATGCAAAGCCTGGGTCTCTATCTGTCTAT 

GTTCACTATGGGCAAAGCAGGCCGAAGGATGCAAAACTTCTTTCCCAGAGTGATGTGGTA

ATCACC^CATATGGAGTTCTAACATCCGT^ATTCTCGCAAGAGAACTCAGCAGACCATGAA 

GGAATTTATGCAGTTCGATGGTTTAGGATTGTTCTTGACGAGGCACATACCATCAAAAAC 

TCAAAAAGCCAAATTTCCTTGGCTGCTGCAGCTCTGGTTGCTGATAGGCGTTGGTGTCTT 

ACGGGTACTCCTATTCAGAACAATCTGGAGGATTTATACAGCCTTCTACGGTTTTTGAGG 

ATTGAACCATGGGGAACTTGGGCATGGTGGAATAAACTTGTCCAAAAGCCATTTGAAGAG 

GGTGATGAGAGAGGGTTAAAGCTAGTGCAGTCTATCTTAAAACCTATCATGCTTAGGAGA 

ACAAAGTCTAGCACAGACCGAGAAGGAAGGCCGATTCTTGTTCTACCCCCTGCTGATGCA 

CGGGTCATTTACTGTGAACTTTCGGAGTCTGAGAGGGATTTCTACGACGCGCTATTTAAA 

AGATCCAAGGTCAAATTTGATCAATTTGTTGAACAAGGCAAAGTTCTTCATAACT 

TCGATCCTGGAACTGCTTTTGCGTCTTCGACAATGTTGTGATCACCCATTTTTAGTAATG 

AGTCGAGGGGATACAGCGGAATACTCTGATCTGAATAAGCTTTCTAAACGTTTCCTTAGT 

GGAAAGTCTTCTGGCTTAGAAAGGGAAGGAAAAGATGTACCGTGAGAGGCTTTTGTTCAG 

GAGGTGGTAGAGGAACTGCGCAAAGGAGAGCAAGGAGAGTGTCCAATATGCCTTGAAGCA 

CTTGAGGATGCTGTATTAACGCCATGTGCTCATAGATTATGTCGTGAGTGTCTCTTGGCA 

AGTTGGAGAAATTCTACTTCTGGGTTATGTCCTGTGTGTAGGAACACTGTAAGCAAACAA 

GAACTCATCACAGCACCAACCGAAAGTAGATTCCAGGTTGACGTGGAAAAGAATTGGGTG 

GAATCATCGAAAATCACTGCTCTTCTGGAAGAGCTTGAAGGTCTTCGTTCTTCAGGCTCT 

AAGAGCATTCTCTTTAGCCAGTGGACCGCTTTCCTCGATCTCCTCCAAATTCCCCTCTCT 

CGGAATAACTTTTCATTTGTCCGTCTTGATGGCACGCTAAGTCAGCAGCAACGAGAGAAG 

GTCCTTAAAGAATTTTCCGAAGATGGCAGTATCCTGGTACTGTTGATGTCTCTAAAAGCT 

GGTGGCGTTGGGATAAATCTAACAGCTGCGTCCAATGCTTTTGTCATGGATCCATGGTGG 

AACCCAGCGGTAGAGGAACAAGCTGTTATGCGTATTCATCGTATAGGGCAAACTAAGGAA 

GTCAAAATCAGAAGATTCATCGTTAAGGGAACGGTTGAAGAGAGAATGGAGGCGGTTCAG 

GCGAGGAAGCAGAGAATGATCTCTGGGGCTTTAACCGATCAAGAAGTACGAAGTGCACGT 

ATAGAGGAACTCAAGATGTTATTTACCTGA
>G1526 Amino Acid Sequence (domain in AA coordinates: 493-620, 864-1006)

MGTKVSDDLVSTVRSVVGSDYSDMDIIRALHMANHDPTAAINIIFDTPSFAKPDVATPTP

SGSNGGKRVDSGLKGCTFGDSGSVGANHRVEEENESVNGGGEESVSGNEWWFVGCSELAG

LSTCKGRKLKSGDELVFTFPHSKGLKPETTPGKRGFGRGRPALRGASDIVRFSTKDSGEI

GRIPNEWARCLLPLVRDKKIRIEGSCKSAPEALSIMDTILLSVSVYINSSMFQKHSATSF

KTASNTAEESMFHPLPNLFRLLGLIPFKKAEFTPEDFYSKKRPLSSKDGSAIPTSLLQLN

KVKNMNQDANGDENEQCISDGDLDNIVGVGDSSGLKEMETPHTLLCELRPYQKQALHWMT

QLEKGNCTDEAATMLHPCWEAYCLADKRELVVYLNSFTGDATIHFPSTLQMARGGILADA 

MGLGKTVMTISLLLAHSWKAASTGFLCPNYEGDOT^ 

LEQKSVLQNGGNLIVCPMTLLGQWKTEIEMHAKPGSLSVYVHYGQSRPKDAKL
ITTYGVIjTSEFSQENSADHEGIYAWWFRIVLDEAHTIKtt^ 

TGTPIQNNLEDLYSLLRFLRIEPWGTWAWWNKLVQKPFEEGDERGLKDVQSILKPIMLRR
TKSSTDREGRPILVLPPADARVIYCELSESERDFYDALFKRSKVKFDQFVEQGKVLHNYA
SILELLLRLRQCCDHPFLVMSRGDTAEYSDLNKLSKRFLSGKSSGLER
EVVEELRKGEQGECPICLEALEDAVLTPCAHRLCRECLLASWRNSTSGLCPVCRNTVSKQ

ELITAPTESRFQVDVEKNWVESSKITALLEELEGLRSSGSKSILFSQWTAFLDLLQIPLS

RNNFSFVRLDGTLSQQQREKVLKEFSEDGSILVLLMSLKAGGVGINLTAASNAFVMDPW

npaveeqavmrihrigqtkevkirrfivkgt^ 

IEELKMLFT*
>G1543 (1..828) 

atgataaaactactatttacgtacatatgcacatacacatataaactatatgctctatat 

catatggattacgcatgcgtgtgtatgtataaatataaaggcatcgtcacgcttcaagtt 

tgtctcttttatattaaactgagagttttcctctcaaactttaccttttcttcttcgatc 

ctagctcttaagaaccctaataattcattgatcaaaataatggcgattttggcggaaaac 

tcttcaaacttggatcttactatctccgttccaggcttctcttcatcccctctctccgat 

gaaggaagtggcggaggaagagaccagctaaggctagacatgaatcggttaccgtcgtct 

gaagacggagacgatgaagaattcagtcacgatgatggctctgctcctccgcgaaagaaa 

ctccgtctaaccagagaacagtcacgtcttcttgaagatagtttcagacagaatcatacc 

cttaatcccaaacaaaaggaagtacttgccaagcatttgatgctacggccaagacaaatt 

gaagtttggtttcaaaaccgtagagcaaggagcaaattgaagcaaaccgagatggaatgc 

gagtatctcaaaaggtggtttggttcattaacggaagaaaaccacaggctccatagagaa 

gtagaagagcttagagccataaaggttggcccaacaacggtgaactctgcctcgagcctt 

actatgtgtcctcgctgcgagcgagttacccctgccgcgagcccttcgagggcggtggtg 

ccggttccggctaagaaaacgtttccgccgcaagagcgtgatcgttga 

>G1543 Amino Acid Sequence (domain in AA coordinates: 135-195) 

MIKLLFTYICTYTYKLYALYHITO^ SSI 

LALKNPNNSLIKIMAILPENSSNIj^^ 

EDGDDEEFSHDDGSAPPRKKLRLTREQSRLLEDSFRQNHTL 
EWFQNRRARSKLKQTEMECEYLKRWFGSLTEEKHRLHREVEELRAIKVG 
TMCPRCERVTPAASPSRAWPVPAKKTFPPQERDR* 
>G162 (101..619)

AGACATACAAC^CCAAAATCTTCTTCTTCACCAACATATTCACTT 

ACGAGAGGTTCTCTCTTATTCGTACCGTTTTAGCAAACAAATGGGTCGGAGAAAGATCAA 

GATGGAGATGGTTCAGGACATGAACACACGACAGGTTACCTTTTCAAAACGGAGGACTGG 

TTTGTTCAAGAAGGCGAGCGAGTTAGCCACGCTCTGCAACGCTGAGTTGGGCATCGTTGT 

CTTTTCACCAGGAGGCAAGCCTTTCTCCTACGGGAAACCGAATCTTGATTCTGTTGCAGA 

GCGATTCATGAGAGAATATGATGATTCAGACAGTGGCGATGAAGAAAAAAGTGGTAATTA 

GAGGCCTAAACTGAAGAGGCTGAGTGAACGTCTCGATTTGCTCAACCAAGAGGTTGAAGC 

TGAGAAGGAACGAGGCGAGAAGAGTCAGGAGAAGCTTGAATCTGCTGGGGATGAGAGATT 

CAAGGAGTCCATTGAGACGCTTACCCTCGATGAACTCAATGAATAC^U^GATAGGCTTCA 

GACAGTCCATGGTAGGATTGAAGGTCAAGTCAATCACTTGCAGGCTTCGTCTTGCCTCAT 

GCTTCTCTCCAGAAAATAGCTAGACCGACTTGTTAGAGTTACATTCTATTTTTTGTATCA 

GCCTACAGAACTTACCAACACATGAAAGTTATTGCTGGTGTAGAATTTTCTGTCATCTAT 

GGGGTGTGACTTTCTATTTGACATCAAATGAAAATGTACCTGGAAATTTGTCTGTATTAA 

TCTCAAGTGTACTTGCTAAACTTGATCAGCTTTTTCGCAAAAAAAA 

>G162 Amino Acid Sequence (domain in AA coordinates: 2-57) 

MGRRKIKMEMVQDMNTRQVTFSKRRTGLFKKASELATLCNAELGIVVFSPGGKPFSYGKP
NLDSVAERFMREYDDSDSGDEEKSGNYRPKLKRLSERLDLLNQEVEAEKERGEKSQEKLE
SAGDERFKESIETLTLDELNEYKDRLQTVHGRIEGQVNHLQASSCLMLLSRK*
>G1640 (168..1196)

TTCGCCAGATCCTTCCTCTATATAAGGAAGTTCATTTCATTTGGAGAGGTTTCGCTGACA 
AGCTGCTCTAGCTTATCTGGTACCGTCGACCTCTCACTCAAGGGTCCAAAAGTGTTTTCT 
CTTTTTCAGTTTCTCTTTCTCTTTTTGACAGAAGAGACCGAGAAGCAATGGGAAGGGCTC 
CGTGTTGTGAGAAAATCGGGTTGAAGAGAGGGAGATGGACAGCCGAGGAAGATGAGATCC 
TCACCAAGTATATTCAGACCAATGGTGAAGGTTCTTGGCGATCTTTGCCTAAGAAAGCTG 
GATTGTTGAGATGTGGAAAGAGCTGTAGACTAAGGTGGATAAACTACTTAAGAAGAGACT 
TAAAAAGAGGAAATATTACTTCCGACGAAGAAGAAATAATCGTCAAGTTGCATTCCCTTC 
TCGGCAACAGATGGTCACTTATTGCAACACATCTACCAGGAAGAACAGACAACGAAATTA 
AAAACTATTGGAACTCACATCTCAGCCGCAAAATCTATGCCTTCACTGCCGTTTCCGGAG 
ATGGACACAATCTACTCGTCAACGATGTAGTCTTGAAGAAATCTTGTTCATCGTCTTCTG 
GAGCCAAGAACAATAACAAGACCAAGAAGAAGAAGAAGGGAAGGACTAGTAGGTCATCCA 
TGAAGAAACACAAGCAAATGGTGACGGCCTCACAATGTTTCTCACAACCTAAGGAGCTAG 
AGAGTGATTTCAGTGAGGGAGGGCAAAATGGTAATTTTGAAGGAGAGTCTTTGGGGCCTT 
ATGAGTGGTTGGATGGTGAGTTAGAACGGCTCTTGAGTAGTTGTGTCTGGGAATGCACTA 
GTGAAGAGGCTGTGATTGGAGTAAATGATGAAAAGGTGTGTGAGAGTGGGGACAATAGTA 
GTTGTTGTGTTAATTTGTTTGAAGAAGAACAAGGAAGCGAGACAAAGATTGGTCACGTAG
GAATCACAGAGGTTGATCATGATATGACGGTGGAAAGAGAAAGAGAGGGAAGTTTTTTAA 
GTTCGAATTCAAATGAAAATAATGATAAAGATTGGTGGGTTGGTCTATGTAATTCTTCAG 
AAGTTGGGTTTGGGGTTGATGAGGAGTTGCTTGATTGGGAGTTTCAAGGTAATGTCACTT 
GTCAAAGTGATGATCTATGGGATCTCTCAGATATTGGAGAGATAACATTGGAGTGATTGT 
ACCGAGCAAGTGGATTGGCGGCCGCTCTAGACAGGCCTCGTACCGGATCTCTAGCTAGAG 
CTTTCGTTCGTATCATCGGTTTCGACAACGTTCGTCAAGT 

>G1640 Amino Acid Sequence (domain in AA coordinates: 14-115) 

MGRAPCCEKIGLKRGRWTAEEDEILTKYIQTNGEGSWRSLPKKAGLLRCGKSCRLRWINY

LRRDLKRGNITSDEEEIIVKLHSLLGNRWSLIATHLPGRTDNEIKNYWNSHLSRKIYAFT

AVSGDGHNLLVNDVVLKKSCSSSSGAKNNNKTKKKKKGRTSRSSMKKHKQMVTASQCFSQ

PKELESDFSEGGQNGNFEGESLGPYEWLDGELERLLSSCVWECTSEEAVIGVNDEKVCES

GDNSSCCVNLFEEEQGSETKIGHVGITEVDHDMTVEREREGSFLSSNSNENNDKDWWVGL

CNSSEVGFGVDEELLDWEFQGNVTCQSDDLWDLSDIGEITLE*

>G1644 (1..348) 

ATGAAATTGATTGATTGGAAAGACTGTGCTTTGATGACTTACACCGAACTCATTTTGGGT 
TTCTGCAATGTTTTAATGTTGATCTGCAGGAGGACTAGTGGACCTATGAGACGAGCAAAA 
GGTGGTTGGACTCCAGAGGAGGATGAGACACTTAGACGAGCAGTTGAAAAGTATAAGGGG 
AAGAGGTGGAAGAAAATAGCGGAATTTTTCCCAGAGAGAACACAAGTCCAATGCTTGCAC 
AGGTGGCAGAAAGTTCTTAATCCAGAGCTTGTTAAAGGACCTTGGACTCAAGAGGTTCTC 
TTATCATTTTCATGTTCTGAAACTTTTTTTGGTTTTCATTTTACGTAA 

>G1644 Amino Acid Sequence (conserved domain in AA coordinates: 39-102)
MKLIDWKDCALMTYTELILGFCNVLMLICRRTS

KRWKKIAEFFPERTQVQCLHRWQKVLNPELVKGPWTQEVLLSFSCSETFFGFHFT*
>G1646 (34..786)

GATCTTTTGATCCAATCACAAGGGAAAGATCCAATGGAGAATAACAACAACAACAACAAC 

CAGCAAC^CCACC^^CCTCCGTCTATCCACCTGGCTCCGCCGTCAC^CCGTAATCCCT 

CCTCCACCATCTGGATCTGCATCAATAGTCACCGGAGGAGGAGCGACATACCACCACCTC 

CTCCAGCAACAACAGCAACAGCTTCAAATGTTCTGGACATACCAGAGACAAGAGATCGAA 

CAGGTAAACGATTTCAAAAACCATCAGCTCCCTCTAGCTCGTATCAAAAAAATCATGAAA

GCTGATGAAGATGTGCGTATGATCTCCGCCGAAGCACCGATTCTCTTCGCGAAAGCTTGT 

GAGCTTTTCATTCTCGAACTTACGATTAGATCTTGGCTTCACGCTGAAGAGAACAAACGT 

CGTACGCTTCAGAAAAACGATATCGCTGCTGCGATTACTAGAACCGATATCTTCGATTTC 

CTTGTTGATATTGTTCCTAGGGAAGAGATCAAGGAAGAGGAAGATGCAGCATCGGCTCTT 

GGTGGAGGAGGTATGGTTGCTCCCGCCGCGAGCGGTGTTCCTTATTATTATCCACCGATG 

GGACAACCGGCGGTTCCTGGAGGGATGATGATTGGAAGACCGGCGATGGATCCTAGCGGT 

GTTTATGCTGAGCCTCCTTCTCAGGCATGGCAAAGCGTTTGGCAGAATTCAGCTGGT 

GGTGATGATGTGTCTTATGGAAGTGGAGG2\AGTAGCGGCCATGGTAATCTCGATAGCCAA 

GGGTAAGTGAATTCTAGTAG
>G1646 Amino Acid Sequence (domain in AA coordinates: 72-162)

ITONltfNNNlENQQPP PTS VYPPGS AVTTVI P PPPSGS AS I VTGGG ATYHHLLQQQQQQLQMF 

WTYQRQEIEQVNDFKNHQLPLARIKKIMKADEDVRMISAEAPILFAKACELFILELTIRS 

WLHAEENKRRTLQKNDIAAAITRTDIFDFLVDIVPREEIKEEEDAASALGGGGMVAPAAS

GVPYYYPPMGQPAVPGGMMIGRPAMDPSGVYAQPPSQAWQSVWQNSAGGGDDVSYGSGGS

SGHGNLDSQG* 

>G1672 (239..1399)

CCATTCCTGACGTCCGGGATGACGCACAATCCCACTATCCTTCGCAAGACCCTTCCTCTA 
TATAAGGAAGTTGATTTCATTTGGAGAGGACACGCTGACAAGCTGACTCTAGCAGATCTG 
GTACCGATCACTCCCGTCTTTATCAAATTCTTCTTCCTCTTACATTTTCCCTATCCAATC 
GATCTCACGCAGATCTGATCAATTTCTGATCAAATCATTTAGAGATCAAAAGAAAACTAT
GAAGAATAGTAAATGTAACCTCATAGATTCAAAGCTCGAAGAACATCATCATCTTTGCGG 
ATCAAAACATTGTCCTGGATGTGGTCGCATGATTCAAGCTGCTACTAAACCAAATTGGGT 
TGGATTGCCGGCAGGAGTGAAATTCGATCCGACAGATGAAGAACTTATAGAAGATTTAGA 
AGCAAAAGTGAAGGGAAAAGAAGAAAATAAGAAATGGTCGTCGTCTCATCCACTTATAGA 
TGAATTTATTCCCACCATTGATGGAGAAGATGGAATATGTTACACTCATCCTCAGAAGCT 
TCCAGGGGTGACAAGAGATGGCTTGAGGAAACACTTCTTCCACAAACCATCAAGAGCTTA 
CACAACCGGAACAAGAAAACGACGTAA7VATAATTCAAACCGATCACGACTCTGAGTTAAC 
CGGATCATCAGAAACCAGGTGGCACAAAACGGGCAAAACAAGACCGGTTATGATCAACGG 
TCAACAAAGAGGATGCAAGAAGATATTAGTACTCTACACAAACTTCGGCAAGAATCGTCG
ACCGGAGAAAACAAATTGGGTGATGCATCAATATCATTTAGGGATTAATGAGGAAGAGAG 
AGAAGGAGAACTTGTGGTCTCCAAGATATTTTATCAGACACAACCAAGACAGTGTGTTAG 
TAATACTAATTGGTCTGATCACCATGGTTCCAAGGACGTGATCGGAATTGGTGTCGGAGA 
TGAGATTTCCAGCGTAGCTGCCACGTTGCAGAGTCTTGGCTCCGGTGACGTCGTTTCTAG 
GGTTAATATGCATCCCCATACAAGATCCTTTGATGAGGGGACAGCCGAAGCTTCAAAGGG 
AAGAGAGAACCAGCATGTGTCTGGCACGTGCGAGGAAGTACATGATGGGATCATAACATC 
ATCAATGTCATCTCATCATATGATTCATGATCATCATAATCAACATCATCAAATCGGAGA 
TAGAAGAGAATTTCACATGTCATC^TCATATCCCATGACCCCTACTATCAC^TGAC^C^ 
TGAGTCAATCTTCCATGTTACAAGTACTATGCCCTTTCAGCGGCAGCAATTAAGGGGTCG 
GTCGTCTGGTTCGGGATTAGAAGACCTAATTATGGGTTGTACCACAGCTACGTGTACAGA 
AGACAATAATCACAAATGATTAAATTCGCAGGAGCATTCAGAAGCAAACCCTCAGCGAAA 
TGCAGAGTGGTTAACGTTTCCACAATTCTGGAACCAAGCCGAATCAGATGATCAAAACCG 
AAGATTTTAACAGAACCAAAAGGAAGCAGAGAAATCTTGCAAAAAGCTCCTGCTTAGCTG
TTGATCAATGCCGGAAATGCTGAGCTATGACTGACTAGTCTCTGCCATTTAACTTACAAT 
ATCACCAGAGGTTGCGATGAATGTTGATTCGCTCAAAGGAGAGCGGCCGCTCTAGACAGG 
CCTCGTACCG 

>G1672 Amino Acid Sequence (conserved domain in AA coordinates: 41-194) 
MKNSKC^IDSKLEEHHHLCGSKHCPGCGRMI 
EAKVKGKEENKKWSSSHPLIDEFIPTIDGEDGICYTHPQKLP^^ 
YTTGTRKRRKIIQTDHDSELTGSSETRWHKTGKTRPVMI^ 

RPEKTNWVMHQYHLGINEEEREGELVVSKIFYQTQPRQCVSNTNWSDHHGSKDVIGIGVG

DEISSVAATLQSLGSGDVVSRVNMHPHTRSFDEGTAEASKGRENQHVSGTCEEVHDGIIT

SSMSSHHMIHDHHNQHHQIGDRREFHMSSSYPMTPTITSQHESIFHVTSTMPFQRQQLRG 

RSSGSGLEDLIMGCTTATCTEDNNHK* 

>G1677 (24..1037)

CAGTACTAATTCTGTGTGTGTTAATGGTTCTAGTTATGGATGATGAAGAGAGTAACAACG 

TTGAAAGATATGACGACGTCGTATTGCCAGGGTTTAGGTTCCATCCCACTGATGAAGAAC 

TCGTAAGTTTCTACxTGAAACGGAAGGTTTTACACAAATCTCTTCCCTTTGATCTCATCA 

AGAAAGTCGACATTTACAAATACGATCCATGGGACCTCCCAAAGCTTGCAGCGATGGGGG 

AAAAAGAGTGGTACTTTTATTGTCCTAGAGACAGGAAATACCGCAAGAGCACAAGACC 

ACCGAGTAACTGGAGGTGGCTTCTGGAAAGCAACCGGAACAGACCGGCCTATATACTCAT 

TGGACTCGACTCGATGCATCGGTTTGAAGAAATCACTTGTGTTCTACCGTGGTCGAGCTG 

CTAAAGGAGTCAAAACCGATTGGATGATGCATGAATTTCGTCTCCCTTCTCTCTCTGACT 

CTCATCACTCATCATATCCCAATTACAATAACAAGAAGCAACACCTTAACAATAACAACA 

ACAGCAAGGAGCTTCCTTCAAACGATGCTTGGGCGATATGTAGAATATTTAAGAAGACAA 

ATGCAGTATCCTCAGAAAGATCAATCCCACAATCTTGGGTTTATCCAACGATTCCTGACA 

ACAATCAACAGTCACACAACAACACCGCAACTCTCTTAGCTTGATCAGACGTTCTCAGCC
ACATATCAACAAGACAAAACTTTATTCCTTCTCCAGTCAACGAACCCGCAAGCTTCACAG
AATCAGCTGCTTCTTACTTCGCGTCTCAGATGCTCGGAGTCACGTACAATACAGCCAGAA 
ACAACGGAACAGGGGATGCTCTGTTTCTGAGAAACAATGGAACAGGGGATGCTCTGGTTC 
TGAGCAACAATGAGAATAACTACTTCAACAACTTGACTGGAGGGTTGACTCATGAGGTTC 
CGAATGTAAGATCAATGGTGATGGAGGAGACTACGGGGAGTGAGATGTCGGCGACGTCGT 
ATTCCACTAACAATTAAGATCATAGTACTATTAACACTTGAATTAGTGTAGACGTTGATC 
ATCGCTAATATGTATTAATTTTTCTTGTCTTACTATAAACGAAAAAAAAA 

>G1677 Amino Acid Sequence (conserved domain in AA coordinates: 17-181)
MVXVMDDEESNNV^RYDDWIj^^ 

DPWDLPKLAAMGEKEWYFYCPRDRKYXWSTRPNRVTGGGFWKATGTDRPI 

LKKSLVFYRGRAAKGVKTDWMMHEFRLPSLSDSHHSSYPNYNNKKQHLNNNNNSKELPSN

DAWAICRIFKKTNAVSSQRSIPQSWVYPTIPDNNQQSHNNTATLLASSDVLSHISTRQNF

IPSPVNEPASFTESAASYFASQMLGVTYNTARNNGTGDALFLRNNGTGDALVLSNNENNY

FNNLTGGLTHEVPNVRSMVMEETTGSEMSATSYSTNN*

>G1765 (139..966)

TCCTTCGCAAGACCCTTCCTCTATATAAGGAAGTTCATTTGATTTGGAGAGGACACGCTG 

ACAAGCTGACTCTAGCAGATCTGGTACCGTCGACAAGAATGACTTGATTGGTGTTCTAAA 

GAGATCGATGTAGTGAAGATGAGTGGCGAAGGTAACTTAGGTAAGGATCATGAAGAAGAA 

AACGAAGCACCACTTCCTGGGTTCAGGTTTCATCCGACGGATGAAGAGCTTTTAGGATAC 

TATCTTCGAAGAAAAGTAGAGAACAAAACCATCAAACTCGAACTTATCAAACAGATCGAT 

ATCTATAAGTACGATCCTTGGGATCTTCCAAGAGTGAGCAGCGTCGGAGAAAAGGAGTGG 

TACTTCTTCTGCATGAGAGGTAGGAAATACAGGAATAGCGTTCGACCAAACCGAGTGACC 

GGTTCAGGTTTCTGGAAAGCCACTGGTATTGATAAACCGGTTTACTCCAATCTTGACTGT 

GTTGGTCTCAAGAAATCTCTGGTTTACTATCTTGGTTCAGCCGGTAAAGGCACCAAAACC 

GATTGGATGATGCATGAATTCCGCCTCCCCTCCACCACGAAAACCGACTCTCCAGCTCAA 

C^^GCAGAGGTATGGACACTTTGCAGAATCTTCAAACGAGTCAC^TCTCAAAGAAACC^ 

ACCATCTTACCACCAAACCGAAAACCGGTTATCACTTTAACCGACACTTGT^ 

AGCAGCTTAGATTCCGACCACACGAGCCACCGTACAGTAGATTCCATGTCCCACGAGCCG 

CCGCTTCCACAGCCACAGAATCCTTATTGGAACCAACATATAGTTGGTTTTAATCAACCG 

AC^TATACTGGTAATGATAATAACCTCCTGATGAGTTTCTGGAACGGCAACGGTGGAGAT 

TTCATAGGAGACTCAGCAAGTTGGGATGAACTTAGATCTGTTATAGATGGCAACACTAAA 

CCCTAGTAATAAAGTTTCCTTTTTTCAGCTTTGTACAAAAAGATAAAACAAACGGCAACC 

GCTCTAGACAGGCCTCGTACCGGGATCCTCTAGCTAGAGCTTTCGTTTCGTATCATCGGT 

TTCGACAACGTTCGT 

>G1765 Amino Acid Sequence (conserved domain in AA coordinates: 20-140) 

MSGEGNLGKDHEEENEAPLPGFRFHPTDEELLGYYL^ 

WDLPRVS SVGEKE WYFFCMRGRKYRNS TOP^ 

LVYYLGSAGKGTKTDWMMHEFRLPSTTKTDSPAQQAEVWTLCRIFKRVTSQRNPTILPPN
RKPVTTLTDTCSKTSSLDSDHTSHRTVDSMSHEPPLPQPQNPYWNQHIVGFNQPTYTGND
NNLLMSFWNGNGGDFIGDSASWDELRSVIDGNTKP*
>G1777 (97..1878)

CTCGTACTTTATCACCTCCGTCGTTCTATAATACTCTCTTCCGTCAATCATATCATTTGT 
CGACAATTTCATTCTGATCAGTTTAAAAATTGATCCATGGATGATAATTTAAGCGGCGAG 
GAAGAAGATTACTATTACTCCTCCGATCAGGAATCTCTCAACGGGATTGATAATGATGAA 
TCCGTTTCGATACCTGTTTCTTCCCGATCAAATACTGTCAAGGTTATTACGAAGGAATCA 
CTTTTGGCTGCACAGAGGGAGGATTTGCGGAGAGTGATGGAATTGTTATCGGTTAAGGAG 
CACCATGCTCGGACTCTTCTTATACATTACCGATGGGATGTGGAGAAGTTGTTTGCTGTT 
CTTGTTGAGAAAGGQAAAGATAGCTTGTTTTCTGGTGCTGGTGTTACACTTCTTGAAAAC 
CAAAGTTGTGATTCTTCCGTTTCTGGTTCTTCTTCGATGATGAGTTGTGATATCTGCGTA 
GAGGATGTACCGGGTTATCAGCTGACAAGGATGGACTGTGGCCATAGCTTTTGCAATAAC 
TGTTGGACTGGGCATTTTACTGTAAAGATAAATGAAGGTCAGAGCAAAAGGATTATATGC 
ATGGCTCATAAGTGTAATGCTATTTGTGATGAAGATGTTGTCAGGGCTCTAGTTAGTAAA 
AGCCAACCAGATTTAGCTGAGAAGTTTGATCGTTTTCTTCTTGAGTCGTATATCGAAGAT 
AACAAAATGGTGAAGTGGTGTCCGAGTACTCCTCATTGTGGGAATGCCATACGTGTTGAG 
GATGACGAGCTCTGTGAGGTTGAATGCTCTTGTGGTTTGCAGTTCTGTTTCAGTTGTTCA 
TCTCAAGCTCACTCCCCTTGCTCTTGTGTGATGTGGGAACTATGGAGAAAGAAGTGCTTT 
GATGAGTCCGAGACTGTTAATTGGATAACTGTTCACACAAAGCCGTGTCCCAAATGTCAC
AAGCCTGTTGAAAAGAATGGTGGATGCAATCTCGTGACTTGTCTTTGTCGACAATCTTTT
TGTTGGTTGTGTGGTGAAGCTACTGGAAGGGACCACACTTGGGCTAGAATCTCGGGTCAT 
AGTTGTGGTCGGTTCCAAGAAGATAAAGAGAAACAAATGGAGAGAGCGAAAAGGGATCTC 
AAGCGGTATATGCATTATCATAACCGATACAAAGCACATATCGACTCCTCCAAGCTAGAG 
GCTAAGCTTAGTAATAATATTAGTAAAAAGGTGTCTATTTCAGAAAAGAGGGAGTTACAA 
CTTAAAGACTTGAGCTGGGCTACCAATGGACTCGATCGGTTATTTAGATCAAGACGAGTT 
CTTTCATATTCATACCCTTTCGCATTTTACATGTTTGGAGATGAGCTGTTTAAAGATGAG 
ATGAGCTCTGAGGAAAGAGAAATAAAACAAAATCTOTTTGAGGATCAG(^G(^GCAGCTT 
GAGGCTAATGTTGAGAAACTTTCTAAGTTCTTGGAGGAACCTTTTGATCAATTTGCTGAT 
GATAAGGTCATGCAGATAAGGATTCAAGTCATCAATTTGTCAGTTGCGGTCGATACACTC 
TGCGAAAATATGTATGAATGCATTGAGAATGACTTGTTGGGTTCTCTGCAACTTGGCATC 
CACAAGATTACTCCATACAGATCAAACGGCATAGAACGAGCATCTGATTTTTATAGTTCC 
CAGAATTCCAAGGAAGCTGTTGGTCAGTCTTCGGATTGTGGATGGACGTCCAGGCTCGAT 
CAAGCTTTGGAGTCAGGGAAGTCGGAAGACACAAGTTGCTCTTCCGGGAAGCGTGCTAGA 
ATAGACGAAAGTTACAGAAACAGCCAAACCACCTTACTAGATTTAAACTTGCCAGCGGAA
GCCATTGAGCGGAAATGAACACTTATCCTTCTTCACCTCCCAATAACACCCTTTTTGTCC 
AAATAAAGTGTGTTACCCGGATATTTATAGC^C^AAACCCAATCCCCTCTGCTTAATTTG 
TCAGTGACCTTACCTAACCCTCTTCA 

>G1777 Amino Acid Sequence (domain in AA coordinates: 124-247)
I^DNLSGEEEDYYYSSDQESLNGIDNDE 

MELLSVKEHHARTLLIHYRWDVEKLFAVLVEKGKDSLFSGAGVTLLENQSCDSSVSGSSS

MMSCDICVEDVPGYQLTRMDCGHSFCNNCWTGHFTVKINEGQSKRIICMAHKCNAICDED

VVRALVSKSQPDLAEKFDRFLLESYIEDNKMVKWCPSTPHCGNAIRVEDDELCEVECSCG

LQFCFSCSSQAHSPCSCVMWEIiWRKKCFDESETVNWI 

TCLCRQSFCWLCGEATGRDHTWARISGHSCGRF 

HIDSSKLEAKLSNNISKKVSISEKRELQLKDFSW 

GDELFKDEMSSEEREIKQNLFEDQQQQLEANVEKLSKPLEEPFDQFADDKVMQIRIQV
IiSVAVDTLCENMYECIElTOL^^ 

CGWTSRLDQALESGKSEDTSCSSGKRARIDESYRNSQTTLLDLNLPAEAIERK*
>G1793 (59..1783)

AGTGATTTATTGATTAACCCAAACACAAAATAAAC^GATTTGACTCAAAAAGAAGAAA^ 

GAATTCTAACAACTGGCTTGGCO?TTCCTCTTT(^C<^AACAACTCl , TCTTTGCCTCCT^ 

TGAATACAACCTTGGCTTGGTCAGCGACCATATGGACAACCCTTTTCAAACACAAGAGTG 

GAATATGATCAATCCACACGGTGGAGGAGGAGATGAAGGAGGAGAGGTTCCAAAAGTGGC 

CGATTTTCTCGGTGTGAGCAAACCGGACGAAAACCAATCCAACCACCTAGTAGCTTACAA 

CGACTGAGACTACTACTTCCATACCAATAGCTTGATGCCTAGCGTCCAATCAAACGATGT 

CGTTGTAGC^GCTTGTGACTCCAATACTCCTAACAACAGTAGCTATCATGAGCTTCAAGA 

GAGTGCTCACAATCTACAGTCACTTACTTTGTCCATGGGGACCACCGCTGGTAATAATGT 

TGTAGACAAAGCTTCACCATCCGAGACCACCGGGGATAACGCTAGCGGTGGAGCACTAGC 

CGTTGTTGAGACGGCCACGCCAAGACGTGCATTGGACACTTTCGGACAACGAACCTCGAT 

CTATCGTGGTGTCACAAGACATCGATGGACTGGTCGATATGAGGCTCATCTATGGGATAA 

TAGTTGTAGAAGGGAAGGCCAGTCTAGGAAAGGAAGACAAGTTTACTTGGGTGGATATGA 

CAAAGAAGATAAAGCAGCAAGATCATATGATCTAGCTGCACTTAAGTACTGGGGTCCTTC 

AACTACTACTAATTTCCCCATTAC^AACTACGAGAAAGAAGTAGAGGAAATGAAGCACAT 

GACGAGACAAGAGTTCGTGGCTGCCATTAGAAGGAAAAGTAGTGGATTTTCGAGAGGCGC 

TTCGATGTATCGAGGAGTTACAAGGCATCACCAACATGGAAGATGGCAAGCAAGGATCGG 

CCGAGTCGCCGGAAACAAAGACCTCTACTTGGGAACTTTTAGCACTGAGGAAGAAGCAGC 

AGAAGCTTACGATATAGCTGCAATAAAGTTTAGAGGACTTAATGCAGTGACCAACTTCGA 

GATCAACCGGTACGACGTGAAAGCCATTCTAGAGAGTAGCACTCTTCCCATCGGAGGAGG 

CGCAGCTAAACGGCTeAAAGAAGCTCAAGCTCTTGAGTCTTCAAGGAAACGCGAGGCGGA 

GATGATAGCCCTTGGTTCAAGTTTCCAGTACGGTGGTGGCTCGAGCACAGGCTCTGGCTC 

CACCTGATCAAGACTTCAGCTTCAACCTTACCCTCTAAGCATTCAAGAACCATTAGAGCC 

TTTTCTATCTCTTCAGAACAATGACATCT^ 

CTCCTCTTTTAATCACCATAGCTATATCCAGACAGAAOT 

CAATTACTTGCAGCAACAGTCGAGCCAGAACTCTCAGCAGCTCTACAATGCGTATCTTCA 
TAGCAATCCGGCTCTGCTTCATGGACTTGTCTCTACCTCTATCGTTGACAACAATAATAA 
CAATGGAGGCTCTAGTGGGAGCTACAACACTGCAGCATTTCTTGGGAACCACGGTATTGG
TATTGGGTCCAGCTCGACTGTTGGATCGACCGAGGAGTTTCCAACCGTTAAAACAGATTA
CGATATGCCTTCCAGTGATGGAACCGGAGGGTATAGTGGTTGGACCAGTGAGTCTGTTCA
GGGGTCAAACCCTGGTGGTGTTTTCACTATGTGGAATGAGTAAACAAGGATCTCTTTCTT 
GCGGCACAAGGAATGGGT 

>G1793 Amino Acid Sequence (conserved domain in AA coordinates: 179-255, 281-349)

MNSNNWLGFPIiSPNNSSLPPHEYNLGLVSDHMDNPFQTQEWNMINPHGG^ 

ADFLGVSKPDENQSNHLVAYNDSDYYFHTNSL^ 

ESAHNLQSLTLSMGTTAGNNVVDKASPSETTGD 

IYRGVTRHRWTGRYEAHLWDNSCRRiSGQSRKGRQVYT^^ 

STTTNFPITNYEKEVEEMKHMTRQEFVAAIRRKSSGFSRGASMYRGVTRHHQHGR
GRVAGNKDLYLGTFSTEEEAAEAYDIAAIKFRGLNAVTNFEINRYDVKAILESSTLPIGG
GAAKRLKRAQALESSRKREAEMIALGSSFQYGGGSSTGSGSTSSRLQLQPYPLSIQQPLE
PFLSLQNNDISHYNNNNAHDSSSFNHH^ 

HSNPALLHGLVSTSIVDNNNNNGGSSGSYNTAAFLGNHGIGIGSSSTVGSTEEFPTVKTD

YDMPSSDGTGGYSGWTSESVQGSNPGGVFTMWNE*

>G180 (54..629)

GTAATTACGATCTACAACAAGTGACATCGTCGTCGACGACGATTCAAGAGAATATGAACT 
TCCTCGTTCCTTTTGAAGAAACC^TGTCTTAACCTTTTTCTCTTCTTCTTCTTCCTCTT 
CTCTTTCTTCTCCTTCTTTCCCCATTCACAACTCTTCCTCCACTACTACTACTCATGCAC 
CTCTAGGGTTTTCTAATAATCTTCAGGGTGGAGGACCCTTGGGATCAAAGGTGGTTAATG 
ATGATCAGGAGAATTTTGGAGGTGGAACTAACAATGATGCTCATTCTAATTCTTGGTGGA 
GATCAAATAGTGGAAGTGGAGATATGAAGAACAAAGTGAAGATAAGGAGGAAA.CT7^AGAG 
AGCCAAGATTCTGTTTCCAAACCAAAAGCGATGTTGATGTTCTTGACGATGGCTACAAAT 
GGCGTAAATATGGTCAGAAAGTCGTCAAGAACAGCCTTGACCCCAGGAGTTATTACAGAT 
GCACACACAACAACTGTAGGGTGAAAAAGAGAGTGGAGCGACTATCGGAAGATTGTAGAA 
TGGTGATTACTACTTACGAAGGTCGTCACAACCACATTCCCTCTGATGACTCCACTTCTC 
CTGACCATGATTGTCTCTCTTCCTTTTAACATCTCTTTCTATATATCTATATATAGACAG 
TTATATGTGCACATATAGATGTGTGATATATTGCAT^ 

AGAGTATGTCATCAGATGTTATGCATATATTCTTGACTTGTTGCTTATAGTATACATATG 
TAATAATATATATTGACATTGGTAGTTCATTTC 

>G180 Amino Acid Sequence (domain in AA coordinates: 118-174) 

MNFLVPFEETlSr\nvrFFS 

VNDDQENFGGGTNNDAHSNSWWRSNSG 

YKWRKYGQKVVTOSrSLHPRSYYRCTHNNCRv^^ 

TSPDHDCLSSF* 

>G192 (63..959)

CTTTTTTCTCTTCTCTCCTCAGAGATTCGAAGCTTTTTGTCTCCCCTGAGTAACCAAATT 
CAATGGCCGACGATTGGGATCTCCACGCCGTAGTCAGAGGCTGCTCAGCCGTAAGCTCAT 
CAGCTACTACCACCGTATATTCCCCCGGCGTTTCATCTCACACAAACCCTATATTCACCG 
TCGGACGACAAAGTAATGCCGTCTCCTTCGGAGAGATTCGAGATCTCTACACACCGTTCA 
CACAAGAATCTGTCGTCTCTTCGTTTTCTTGTATAAACTACCCAGAAGAACCTAGAAAGC 
CACAGAACCAGAAACGTCCTCTTTCTCTCTCTGCTTCTTCCGGTAGCGTCACTAGCAAAC 
CCAGTGGCTCCAATACCTCTAGATCTAAAAGAAGAAAGATACAGCATAAGAAAGTGTGCC 
ATGTAGCAGCAGAAGCTTTAAACTCCGATGTCTGGGCATGGCGAAAGTACGGACAGAAAC 
CCATCAAAGGTTCACCATATCCAAGAGGATACTACAGATGTAGTACATCAAAAGGTTGTT 
TAGCCCGTAAACAAGTGGAGCGAAATAGATCCGACCCGAAGATGTTTATCGTCACTTACA 
CGGCGGAGCATAATCATCCAGCTCCGACACACCGTAATTCTCTCGCCGGAAGCACACGTC 
AGAAACCATCCGATCAACAGACGAGTAAATCTCCGACGACCACTATTGCTACTTATTCAT 
CGTCTCCGGTGACTTCAGCCGACGAATTTGTTTTGCCTGTTGAGGATCATCTAGCGGTGG 

TCTTCGATGGGTTAGAGGAATTCGCAGCCGGAGATAGCTTTTCCGGGAACTCGGCTCCGG 
CGAGTTTTGATCTCTCTTGGGTTGTGAACAGTGCCGCCACTACCACCGGAGGAATATGAT 
TAGATTACGACGGCTTAGAATACTCTTATTAGGACAGATTTATAGGATTAAGGAATTATT 
CTCGGAGCATATGTAAAAATAGGATAAAAGAAAATGTTCTTTGTTACTTTTTTTCGGGTT 
TTCTTCCTATTGTTTCTAAACATCTTAGAAAAAATTTAATTGTATATTCCTTAAGCTCGA 
TACATCTTGTTTTAAAAAAAAAAAAAAAAAA 

>G192 Amino Acid Sequence (domain in AA coordinates: 128-185)
MADDWDLHAVVRGCSAVSSSATTTVYSPGVSSHTNPIFTVGRQSNAVSFGEIRDLYTPFT

QESVVSSFSCINYPEEPRKPQNQKRPLSLSASSGSVTSKPSGSNTSRSKRRKIQHKKVCH

VAAEALNSDVWAWRKYGQKPIKGSPYPRGYYRCSTSKGCLARKQVERNRSDPKMFIVTYT

AEHNHPAPTHRNSLAGSTRQKPSDQQTSKSPTTTIATYSSSPVTSADEFVLPVEDHLAVG

DLDGEEDLLSLSDTWSDDFFDGLEEFAAGDSFSGNSAPASFDLSWVVNSAATTTGGI*

>G1948 (18..1118)

AAAAGGTCTTCTTGGCCATGGATACTTGTGCTCTAGTAATCCATCAGTCTCTGTCTCGCA 
TCAAACTTTCTCCTCCCAAATCTTCTTCTTCTTCTTCTTCTGCTTTCTCCCCTGAATCCT 
TACCGATCAGACGGATCGAGCTGTGTTTCCGAGGAGCTATATGTGCCGCCGTACAAAGAA 
ACTACGAAGAAACGACCTCCTCCGTGGAAGAGGCAGAGGAAGATGATGAGTCATCATCAT 
CGTACGGAGAAGTGAACAAGATCATTGGAAGCCGAACGGCGGGGGAAGGAGCCATGGAGT
ACCTTATCGAGTGGAAGGACGGCCATTCTCCGTCGTGGGTTCCATCGAGCTACATCGCAG 
CAGACGTAGTGTCGGAGTACGAGACACCCTGGTGGACGGCAGCTAGAAAAGCCGACGAGC 
AGGCCCTGTCACAGCTCCTGGAGGACCGAGACGTCGATGCCGTGGACGAAAACGGCCGGA 
CGGCTCTGCTTTTCGTGGCAGGTCTGGGGTCGGACAAGTGCGTAAGGCTTCTGGCGGAGG 
CTGGAGCCGATCTCGACCACCGAGACATGAGGGGAGGCTTGACGGCGCTGCACATGGCGG 
CTGGTTACGTGAGGCCGGAGGTGGTGGAGGCGCTGGTGGAGCTGGGAGCTGATATTGAAG 
TGGAAGACGAGAGAGGGTTAACGGCGTTGGAACTAGCGAGGGAGATTCTGAAGACGACGC 
CGAAGGGGAATCCGATGCAGTTCGGGAGGAGAATTGGGTTAGAGAAAGTGATCAATGTCC 
TGGAAGGACAAGTGTTCGAGTACGCCGAGGTGGATGAGATCGTAGAGAAACGAGGGAAAG 
GCAAAGACGTTGAATATCTGGTCAGATGGAAGGACGGTGGAGATTGCGAGTGGGTGAAAG 
GTGTACACGTGGCGGAAGATGTGGCTAAGGACTACGAGGATGGGCTGGAGTACGCTGTAG 
CGGAGAGTGTGATCGGGAAGAGGGTGGGAGACGATGGGAAGACCATCGAGTATCTTGTCA 
AATGGACTGATATGTCTGATGCCACTTGGGAGCCTCAGGACAATGTCGACTCTACTCTTG 
TTCTACTCTACCAACAACAACAACCAATGAATGAATGATTGATTTTGATGATTACATTCT 
TCTCAATTTGCTTCTTTCTCATATGTGTTGGTTCATCTGACCGGTTCGGTTGGTACGTAC 
CGGTACATTTTCATTTTCTTTTAAGATGTGATCCT 

CTATTTGATTTTATATCCATGCTTTGAATTTTGCTTCCCTTTTTGGGGAGATTCATGAAA 

>G1948 Amino Acid Sequence (domain in AA coordinates: entire protein)

MDTCALVIHQSLSRIKLSPPKSSSSSSSAFSPESLPIRRIELCFRGAICAAVQRNYEETT

SSVEEAEEDDESSSSYGEVNKIIGSRTAGEGAMEYLIEWKDGHSPSWVPSSYIAADVVSE

YETPWWTAARKADEQAIiSQLLEDRDVDAVDEN^ 

HRDMRGGLTALHMAAGYVRPEVVEALVELGADIEVEDERGLTALELAREILKTTPKGNPM
QFGRRIGLEKVINVLEGQVFEYAEVDEIVEKRGKGKDVEYLVRWKDGGDCEWVKGVHVAE
DVAKDYEDGLEYAVAESVIGKRVGDDGKTIEYLVKWTDMSDATWEPQDNVDSTLVLLYQQ
QQPMNE* 

>G2123 (1..657) 

ATGAGAAAAGTATGTGAGCTTGATATAGAGCTAAGTGAAGAGGAAAGAGACCTACTAACA 

ACTGGATACAAGAATGTCATGGAGGCTAAGAGAGTTTCATTGAGAGTAATATCATCCATT 

GAAAAAATGGAAGACTCGAAAGGAAACGACCAAAATGTGAAACTGATAAAAGGACAACAA 

GAAATGGTTAAATATGAGTTTTTCAATGTTTGTAATGACATTTTGTCTCTCATTGAT^ 

CATCTCATACCATCAACTACTACTAATGTCGAATCAATTGTCCTTTTTAACAGAGTGAAA 

GGAGATTATTTTCGATATATGGCAGAGTTTGGTTCTGATGCTGAACGTAAAGAAAATGCA 

GATAATTCTCTAGATGCATATAAGGTTGCAATGGAAATGGCAGAGAATAGTTTAGCACCC 

ACGAATATGGTTAGACTTGGATTGGCTTTAAATTTCTCGATATTCAATTATGAGATCCAT 

AAATCTATTGAAAGCGCATGTAAATTGGTTAAGAAAGCTTACGATGAAGCAATCACTGAA 

CTCGATGGCCTTGACAAGAATATATGCGAAGAGAGCATGTATATCATAGAGATGCTTAAA 

TACAATCTTTCTACGTGGACTTCAGGCGATGGTAATGGTAATAAGACAGACGGTTAG 

>G2123 Amino Acid Sequence (domain in AA coordinates: 99-109)

MRKVCELDIELSEEERDLLTTGYKNVMEAKRVSLRVISSIEKMEDSKGNDQNVKLIKGQQ

EMVKYEFFNVCNDILSLIDSHIiIPSTTTNV^SIVLFNRVKGDYFRYMAEF^ 

DNSLDAYKVAMEMAENSLAPTNMVRLGLALNFSIFNYEIHKSIESACKLVKKAYDEAITE

LDGLDKNICEESMYIIEMLKYNLSTWTSGDGNGNKTDG*

>G2138 (27..512)

GGAACCCTAATTTCCGCAAATTCACTATGAAGCGTATTATCAGAATCTCATTCACCGACG 
CAGAAGCCACCGATTCTTCTAGCGACGAAGACACGGAGGAGCGTGGAGGAGCATCCCAGA 
CTCGGCGCCGTGGGAAACGCCTCGTTAAAGAGATCGTAATCGATCCTTCCGATTCCGCCG
ATAAACTCGATGTCTGCAAAACACGGTTCAAAATCAGGATCCCGGCGGAATTTCTCAAGA

CGGCGAAAACGGAGAAGAAATATCGTGGAGTGAGGCAGAGGCCGTGGGGGAAGTGGGTGG 

CGGAGATCAGATGTGGAAGAGGAGCTTGTAAAGGACGACGTGATCGTCTCTGGCTGGGTA 

CTTTTAACACTGCTGAGGAAGCTGCTCTAGCTTATGATAACGCTTCAATTAAGCTGATTG 

GACCTCACGCGCCGACCAATTTTGGTTTGCCGGCGGAGAATCAAGAGGATAAGACGGTGA 

TTGGAGCTTCTGAGGTTGCTAGAGGCGCGTGAAGTGGGGTTGGTAATTTAGTTGTTAGC 

>G2138 Amino Acid Sequence (domain in AA coordinates: TBD) 

MKRIIRISFTDAEATDSSSDEDTEERGGASQTRRRGKRLVKEIVIDPSDSADKLDVCKTR 

FKIRIPAEFLKTAKTEKKYRGVRQRPWGKWVAEIRCGRGACKGRRDRLWLGTFNTAEEAA

LAYDNASIKLIGPHAPTNFGLPAENQEDKTVIGASEVARGA*

>G2139 (40..663)

CCTACAAGAAATCAAACACTAGTTCTGGTTTCTGCAAACATGTCATCTACGAAGCAAGCA 
AAGGGAAGAAAAACAAAGGGGAAGCAAAAGATCGAGATGAAGAAGGTGGAGAACTATGGA 
GATAGGATGATTACGTTCTCAAAACGTAAAACCGGAATTTTTAAGAAAATGAACGAGCTC 
GTAGCAATGTGTGACGTTGAAGTGGCTTTCTTGATTTTCTCTGAACCCAAGAAGCCCTAT 
ACATTCGCACATCCGTCTATGAAGAAAGTGGCTGACCGGTTAAAGAACCCTTCGAGACAA 
GAACCATTAGAGAGAGACGATACCAGACCCCTCGTCGAAGCTTATAAGAAACGAAGGCTC 
CACGACCTCGTAAAAAAAATGGAGGCGCTCGAAGAGGAGCTTGCGATGGATCTAGAGAAG 
TTGAAACTGTTGAAGGAATCGAGAAATGAAAAGAAGTTAGATAAAATGTGGTGGAACTTT 
CCTTCGGAAGGTTTGAGCGCGAAGGAGCTGCAGCAAAGGTACCAAGCGATGCTCGAGTTA 
CGTGATAACTTATGCGACAATATGGCTCACTTACGATTGGGAAAAGACTGTGGTGGTTCA 
TCTTCTGTTCGTGTGGGACGTCGAGTTTCTGGTGGTGTTCGTCTGTTCGATCGTGAAGCA 
TGATCATACATATTCATACTTGATO^ 

ATACTGCATGTATCCATTTGACGAAGCTCAATCGTCTCGAGTATATCTCTATTATCTAAC 
AGTATTGAGAAAAAAGGAGTTTCAGTAAAAAAAAAAAAAAAAAAAAAA 

>G2139 Amino Acid Sequence (conserved domain in AA coordinates: 14-69)

MSSTKQAKGRKTKGKQKIEMKKVENYGDRMITFSKRKTGIFKKMNELVAMCDVEVAFLIF

SQPKKPYTFAHPSMKKVADRLKNPSRQEPLERDDTRPLVEAYKKRRLHDL

LAMDLEKLKLLKESRNEKKLDKMWWNFPSEGLSAKELQQRYQA^ 

GKDCGGSSSVRVGRRVSGGVRLFDREA*

>G2343 (1..1113)

ATGGGTCATCACTCATGCTGCAACCAGCAAAAGGTGAAGAGAGGGCTTTGGTCACCGGAA 

GAAGATGAGAAGCTTATTAGATATATCACAACTCATGGCTATGGATGTTGGAGTGAAGTC 

CCTGAAAAAGCAGGGCTTCAAAGATGTGGAAAAAGTTGTAGATTGCGATGGATAAACTAT 

CTTCGACCTGATATCAGGAGAGGAAGGTTCTCTCCAGAAGAAGAGAAATTGATCATAAGC 

CTTCATGGAGTTGTGGGAAACAGGTGGGCTCATATAGCTAGTCATTTACCGGGAAGAACA 

GATAACGAGATTAAAAACTATTGGAATTCATGGATTAAGAAAAAGATACGAAAACCGCAC 

CATCATTACAGTCGTCATCAACCGTCAGTAACTACTGTGACATTGAATGCGGACACTACA 

TCGATTGCCACTACCATCGAGGCCTCTACCACCACAACATCGACTATCGATAACTTACAT 

TTTGACGGTTTCACTGATTCTCCTAACCAATTAAATTTGACCAATGATCAAGAA 

ATAAAGATTCAAGAAACTTTTTTCTCCCATAAACCTCCTCTCTTCATGGTAGACACAACA 

CTTCCTATCCTAGAAGGAATGTTCTCTGAAAACATCATC^ 

GATCATGATGACACGCAAAGAGGAGGAAGAGAAAATGTTTGTGAACAAGCATTTCTAACA 
ACTAACACGGAAGAATGGGATATGAATCTTCGTCAGCAAGAGCCGTTTCAAGTTCCTACA 
CTGGCGTCACATGTGTTCAACAACTCTTCCAATTCAAATATTGACACGGTTATAAGTTAT 
AATCTACCGGCGCTAATAGAGGGAAATGTCGATAACATCGTCCATAATGAAAACAGCAAT 
GTCCAAGATGGAGAAATGGCGTCCACATTCGAATGTTTAAAGAGGCAAGAACTAAGCTAT 
GATCAATGGGACGATTCACAAC^^TGCTCTAACTTTTTCTTTTGGGACAACCTTAATATA 
AACGTGGAAGGTTCATCTCTTGTTGGAAACCAAGACCCATCAATGAATTTGGGATCATCT 
GCCTTATCTTCTTCTTTCCCTTCTTCGTTTTAA 

>G2343 Amino Acid Sequence (domain in AA coordinates: 14-116) 

MGHHSCCNQQKVKRGLWSPEEDEKLIRYITTHGYGCWSEVPEKAGLQRCGKSCRLRWINY

LRPDIRRGRFSPEEEKLIISLHGVVGNRWAHIASHLPGRTDNEIKNYWNSWIKKKIRKPH

HHYSRHQPS VTTVTLNADTTS IATTIEASTTTTSTIDl^HFDGFTDSPNQI^FTlJTDQETO 

IKIQETFFSHKPPLFMVDTTLPILEGMFSENIITNNNKNNDHDDTQRGGRENVCEQAFLT

TNTEEWDMNLRQQEPFQVPTLASHVFl^SSNSNI^ 

VQDGEMASTFECLKRQELSYDQWDDSQQCSNFFFWDNI^
ALSSSFPSSF*
>G265 (280..1317)

CTTTGGTCTTGGAAGCCAAATCAAACCTTTCCTTCAATCCTCAAATTTTCGAAAATTTTC 

TCTTTTGCTTTACGTTCTCTCAATTCTTATTTGTAAGAAAGTTTGTTCCTTTAATCAATC 

AAATCAAAGAGACTTTTGAAGATTGTTTCCCAATTTGCGTCAATCGGGATCGAGTCAAAT 

CTGAAATCTTCTCCACTCATCATCTGACTATAAGACTTAATCAAGGGACTTTTTGTTCGG 

GTTTGGTTTTAAACGTCTTGGATTCGAAGTGGTTAAGGTATGGATGAAAATAATGGAGGT 

TCAAGCTCACTTCCACCTTTCCTTACTAAAACATATGAAATGGTTGATGATTCTTCTTCT 

GACTCGGTCGTTGCTTGGAGCGAAAACAACAAAAGCTTCATCGTCAAGAATCCAGCAGAG 

TTTTCAAGAGACCTTCTTCCGAGATTCTTCAAGCATAAGAATTTCTCAAGT 

CAGCTTAATACATATGGTTTTCGAAAAGTAGATCCTGAGAAATGGGAATTCTTGAATGAT 

GATTTTGTTAGAGGTCGACCTTACCTTATGAAGAACATTCATAGACGAAAACCGGTTCAT 

AGCCACTCGTTAGTGAATCTACAAGCGCAAAATCCTTTGACGGAATCAGAAAGACGGAGC 

ATGGAGGATCAGATAGAAAGACTGAAAAATGAGAAAGAAGGCCTTCTTGCGGAGTTACAG 

AACCAAGAGCAAGAACGGAAAGAGTTTGAGCTGCAAGTAACGACATTGAAAGATCGGTTA 

CAACATATGGAACAACATCAGAAATCAATAGTGGCATATGTTTCACAGGTTTTGGGAAAA

CC^GGACTTTCACTAAACCTCGAAAACC^TGAGAGAAGAAAAAGAAGATTTCAAGAGAAC 

TCTCTTCCTCGAAGCAGTTCACACATAGAACAGGTCGAAAAGTTAGAATCTTCGCTAACG 

TTTTGGGAGAATCTTGTATCGGAATCATGCGAGAAGAGCGGTTTGCAGTCATCAAGCATG 

GATCATGATGCAGCTGAGTCAAGTCTAAGTATTGGCGATACACGACCCAAATCATCGAAG 

ATTGATATGAACTCAGAGCCGCCCGTTACCGTTACTGCGCCTGCTCGAAAAACAGGCGTT 

AACGATGACTTTTGGGAACAATGTTTGACAGAGAACCCTGGATGAACCGAGCAACAAGAA 

GTTCAGTCAGAGAGAAGAGATGTCGGTAATGATAATAATGGTAATAAGATTGGAAATCAA 

AGGACGTATTGGTGGAATTCAGGGAATGTAAATAACATTACAGAGAAAGCTTCTTGACAT 

GAATGAGGTTTTTGTAAAATAGTTTTCTTTTGGTTCCACTGAGATTATTGTATGTGTTCA 

TTATTTATTACTCTGTTTCTGTAAAAACAAATCTCTCTATTGTTTGAGGCAGGAGTGACA 

TAAATGCATATGCAGAATTGGTTTCAAAAA 

>G265 Amino Acid Sequence (domain in AA coordinates: 11-105) 
MDEl^GGSSSLPPFLTKTYEMVDDSSSDSW^ 
NFSSFIRQLlTrYGFRKVDPEKWEFLNDDFVRGRPYLMKNI 
TESERRSMEDQIERLKNEKEGLLAELQNQEQERKEFELQV^ 

VSQVLGKPGLSLNLENHERRKRRFQENSLPPSSSHIEQVEKLESSLTFWENLVSESCEKS
GLQSSSMDHDAAESSLSIGDTRPKSSKIDMNSEPPVTVTAPAPKTGVNDDFWEQCLTENP
GSTEQQEVQSERRDVGNDNNGNKIGNQRTYWWNSGNVNNITEKAS*
>G2792 (1..960)

ATGGATCATCATCATCACATAGCATCAAGAAATTCATC7UVCAACATCAGAATTACGATCA 
TTCGAGCCAGCGTGCCATAACGGTAATGGTAACGGTTGGATCTATGACCCAAATCAAGTT 
AGGTACGATCAAAGTAGTGACCAACGGCTGTCAAAGTTGACGGATCTTGTAGGCAAGCAC 
TGGTCAATTGCACCACCGAATAATCCCGACATGAACCATAACCTTCATCATCACTTCGAT 
CATGATCATTCTCAAAACGACGACATTTCTATGTACAGACAAGCCTTGGAGGTGAAAAAT 
GAGGAAGATCTTTGTTACAATAATGGCTCAAGTGGTGGTGGTTCCTTGTTCCATGATCCT 
ATAGAAAGTTCTAGAAGTTTCCTTGATATAAGGTTAAGTAGGCCATTAACGGATATTAAT 
CCGTCATTTAAGCCATGCTTTAAGGCCTTAAACGTATCCGAGTTTAACAAGAAAGAACAT 
CAAACGGCATCTCTGGCAGCAGTGAGACTGGGAACAACAAACGCTGGAAAAAAGAAGAGA 
TGTGAAGAAATTTCCGATGAGGTTTCAAAGAAGGCCAAGTGCAGTGAGGGCTCTACACTT 
TCGCCAGAGAAGGAACTACCGAAAGCCAAACTTCGAGACAAGATCACGACTCTACAGCAA 
AOTGTGTCTCCCTTTGGAAAGACTGATACTGCTTCTGTGCTTCAAGAGGCCATCACTTAC 
ATAAATTTTTATCAAGAGGAAGTTAAGCTGCTAAGCA 

ATGAAGGATCCATGGGGGGGATGGGACAGAGAAGATCACAACAAAAGGGGACCGAAGCAT 
CTTGATCTAAGGAGTAGAGGGCTTTGTTTGGTTCCTATTTCATATACCCCAATCGCATAC 
CGCGATAACAGTGCAACTGACTACTGGAATCCCACGTATAGAGGTTCTTTGTATCGTTAG 
>G2792 Amino Acid Sequence (domain in AA coordinates: 190-258)
MDHHHHIASI^SSTTSELPSFEPACHNGNGNGWIYDPNQVRYDQSSDQRIiSKLTDLVG^ 
WS IAPPNNPDMNHNIiHHHFDHDHSQiTODISMyRQALEVKNEEDLCY^ 
IESSRSFLDIRLSRPLTDINPSFKPCFKALNVSEFNKKEHQTASIxAAVRLGTTNAGKKK^ 
CEEISDEVSKKAKCSEGSTLSPEKEIiPKAKLRD 

INFYQEQVKLLSTPYMKNSSMKDPWGGWDREDHNKRGPKHLDLRSRGLCLVPISYTPIAY






RDNSATDYWNPTYRGSLYR* 
>G2830 (1..903) 

ATGTCTTCCATCCCAAATAGGTTCAATATTTATGGTGGTGATACCACAAACCATCGTGAA 
TCGCTTCCCATCGAAATGAATCACAACTCTCGAATGGTTCGATCCATGTTCATTACATCT 
GATCGCATGAATCATAGAGATTTGTTTTCTTCTCCTCCTTCTTTCTCTTCTTATCAAAAT 
TCACATATCTCTTCATCTTCTGTTGGGTTTAATAATTCACATATGACTTATCATATGCTG 
AAAAGAAATTATGATTCTGTTTCCCGTGCTGATTATTTCTCTACTAAAGATCATTCTCAT 
TTTACTC^GTATCTTTCACTCAAACCATCACAAATAAGTATACTACTATTGTTCCTTCC 
AATATATTTGACACTGTTCACTATGATATTGGTCGTGTCAAACGTGCCATAGATTTTAGA 
AATATTTGGAATCCTAAATCTCATCTTCGAAAAAAATTTAATAGGCAATGCGAGATTTTG 
AATCCTACCCCTCTTAATATCGTCTTTCCGCACCAGGATTCAGCTGATCGTCAACATTTA 
GACATTATTTTCTCGTCATCAAAGCACAACGA^ 

AAAATTTCCGAACCAACCAATCTGTTTGAAAAATCTAATTCTTATGATTCTCAAGAAGAT 
GAGAAAATCGATGCTTATCAATATGATGGTCGTACACATAGTCTACCGTATACGAAATAC 
GGTCCATATACATGTCCCAGGTGTAACGGTGTGTTTGATACTTCTCAAAAATTTGCTGCA 
CATATGTTATCTCACTACAATAATGAGACGGACAAAGAAAGAGACCAAAGATTTCGTGCA 
AGAAATAAAAAACGATATCGTAAGTTTATGGAC^GTCTTAAAATATCAAAACAGAAGATA 
TGA 

>G2830 Amino Acid Sequence (domain in AA coordinates: 245-266)
MSSIPNRFNIYGGDTTEIHRESLPIEMNHNSRMVRSMFITSDR 

SHISSSSVGFNNSHMTYHMLKRNYDSVSRADYFSTKDHSHFTQVSFTQTITNKYTTIVPS
NIFDTVHYDIGRVKRAIDFRNIWNPKS 

DIIFSSSKHNHVFQDGRSLKKISEPTNLFEKSNSYDSQEDEKIDAYQYDGRTHSLPYTKY

GPYTCPRCNGVFDTSQKFAAHMLSHYNNETDKERDQRFI^ 

* 

>G286 (94..2454)

TGCAATTTCTCTCGACCAAAACCCTAATTTCAGGTTTGGGGTTTTCCTTCTTTCACTGTC 

AATTTTGATGAAACTTGTGATTCAGTGATTAGAATGAATGCTAATGAGCAAACTCGATCC 

GCCAATGGCATTGGCAATGGCAATGGTGAGTCTATTCCCGGGATTCCAGATGACTTACGG 

TGCAAGAGATCGGATGGTAAACAGTGGAGATGCACTGCAATGTCCATGGCTGATAAGACT 

GTTTGTGAGAAGCACTACATCCAAGCAAAGAAGCGGGCGGCTAATTCTGCTTTCAGGGCG 

AACCAGAAGAAAGCGAAAAGGCGATCATCGTTAGGCGAAACAGATACGTATTCGGAAGGG 

AAGATGGATGATTTCGAGTTACCAGTCACCAGCATTGACCACTATAATAACGGTCTTGCC 

TCTGCTTCCAAGAGTAATGGTAGACTAGAGAAGAGACATAATAAAAGCCTGATGCGGTAC 

TCGCCCGAGACACCGATGATGAGGAGTTTCTCTCCACGTGTTGCAGTGGATTTGAATGAT 

GACTTGGGTAGAGATGTTGTAATGTTTGAAGAGGGCTACAGATCTTATAGGACACCACCA 

TCTGTTGCTGTTATGGATCCGACACGAAAC^^ 

TACTCAGCAGCAAGCACAGATGTGTCTGCAGAGTCTTT 

CAGAGAAAAGATAGAGAGAGAATCATTTCTTGCCTCAAATGCAATCAAAGAGCCTTCTGC 
CACAATTGTCTATCGGCAAGGTACTCGGAGATATCACTTGAAGAAGTCGAGAAAGTTTGC 
CCTGCATGTCGTGGCTTGTGTGATTGCAAATCTTGCCTGCGTTCAGATAATACAATAAAG 
GTTCGGATCCGGGAAATACCCGTTTTGGACAAGTTGCAGTATCTTTATCGTCTATTATCA 
GCTGTCCTACCAGTCATAAAGCAGATCCATCTTGAACAATGTATGGAAGTTGAACTAGAG 
AAGAGGCTTCTTGAAGTTGAGATTGATCTTGTCAGGGCAAGATTGAAAGCAGATGAGCAG 
ATGTGCTGCAACGTGTGTCGGATACCAGTTGTTGACTACTACCGTCACTGTCCGAACTGC 
TCATATGACCTTTGCCTGAGATGCTGTCAAGATCTACGGGAAGAGTCTTCAGTGACGATT 
AGTGGGACTAACCAAAACGTACAAGATAGAAAAGGAGCTCCCy^ACTAAAACTAAACTTT 
TCATACAAGTTTCGTGAGTGGGAAGCCAACGGTGATGGGAGCATCCCTTGCCCTCCTAAG 
GAGTATGGAGGCTGCGGTTCACATTCTTTGAATCTTGCCCGCATTTTCAAGATGAATTGG 
GTTGCAAAGCTTGTGAAAAATGCTGAGGAGATTGTTAGTGGCTGCAAATTATCTGATCTT 
CTGAACCCTGATATGTGTGATTCAAGATTCTGCAAATTTGCTGAGAGAGAAGAGAGCGGT 
GACAACTACGTGTACAGCCCGTCGCTTGAAACGATTAAAACTGATGGAGTAGCTAAGTTT 
GAGCAACAATGGGCAGAGGGTCGGCTTGTTACTGTGAAAATGGTACTTGATGACTCATCT 
TGCTCTAGATGGGATCCTGAGACTATTTGGAGGGATATAGACGAGCTTTCGGACGAGAAA 
CTGAGAGAACATGATCCATTCTTGAAGGCCATTAATTGCTTGGATGGTTTAGAGGTTGAT 
GTAAGACTTGGGGAGTTTACAAGAGCATATAAAGATGGAAAGAACCAAGAGACAGGTCTT 
CCGCTATTGTGGAAGTTAAAGGACTGGCCGAGCCCAAGTGCTTCCGAGGAGTTCATTTTC 
TACCAAAGACCTGAGTTTATCAGAAGTTTTCCGTTTCTCGAGTACATTCATCCCCGGTTA
GGCCTTCTGAATGTTGCAGCCAAGTTACCTCATTACTCGCTCCAAAACGATTCAGGTCCA 
AAGATTTATGTGTCTTGTGGGACGTACCAAGAAATCAGTGCTGGCGATTCATTGACTGGT 
ATTCACTACAACATGCGTGACATGGTATACCTATTGGTGCACACGTCTGAAGAAACAACA 
TTCGAAAGGGTGAGAAAAACAAAACCTGTTCCAGAGGAACCTGACCAGAAGATGAGCGAA 
AATGAGTCACTTCTTAGCCCTGAGCAGAAATTAAGGGACGGAGAGTTACATGATCTATCA 
CTTGGTGAAGCCAGTATGGAGAAGAATGAACCTGAGTTGGCGTTGACTGTGAATCCAGAG 
AACTTAACGGAAAACGGTGACAACATGGAATCTTCTTGCACATCTTCATGTGCAGGAGGA 
GCCCAGTGGGATGTCTTTCGACGCCAAGACGTCCCAAAGTTGTCCGGGTATTTGCAGAGA 
ACATTCCAGAAGCCTGATAATATCCAGACTGATTTTGTAAGCCGTACCTGCTAATTCAAA 
TAAATGAAGTGTGTAAAGTCTTGTATGTGGAATGATTGAGTTTCCTAGTTTGTTTACTCT 
GGTTTCAGGTGTCACGCCCGTTGTATGAAGGATTGTCTTTAAATGAACACCACAAGAGAC 
AACTAAGAGACGAGTTTGGAGTTGAGCCATGGACATTTGAGCAACATCGTGGTGAGGCTA 
TCTTCATTCCGGCTGGATGTCCGTTCCAAATCACTAATCTTCAGTCGAATATTCAGGTGG 
CACTTGACTTCTTGTGCCCTGAAAGCGTTGGAGAGTCAGCAAGACTAGCTGAAGAAATCC 
GGTGTTTACCAAACGACCACGAGGCAAAACTTCAGATTCTAGAGATTGGAAAGATATCAT 
TATACGCAGCTAGCTCAGCCATTAAAGAGGTTCAGAAACTGGTCTTGGATCCAAAGTTTG
GAGGAGAGCTTGGATTTGAAGACTCTAACTTAACCAAAGCAGTCTCTCACAACTTAGACG 
AGGCAACCAAGCGGCC 

>G286 Amino Acid Sequence (domain in AA coordinates: TBD) 
MNANEQTRSANGIGNGNGESIPGIPDDLRCKRSDGKQWRCTAMSMADKTVCEKHYIQAKK
RAANSAFRANQKKAKRRSSLGETDTYSEGKMDDFELPVTSIDHYNNGLASASKSNGRLEK
RHNKSIiMRYSPETPMMR^FSPRVAVDIiNDDIiGR^ 

SHQSTSPMEYSAASTDVSAESLGEICHQCQRKDRERIISCLKCNQRAFCHNCLSARYSEI

SLEEVEKVCPACRGLCDCKSCLRSDNTIKVRIREIPVLDKLQYLYR

EQCMEVELEKRLIjEVEIDLVRARLKADEQMCCNVCRIPVVDYYRHCPNCSTO 

LREESSVTISGTNQNVQDRKGAPKLKLNFSYKFPEWEANGDGSIPCPPKEYGGCGSHSLN

LARIFKMNWAKLVKNAEEIVSGCKL^ 

IKTDGVAKFEQQWAEGRLVTVKMVLDDSSCSRWDPETIWRDIDELSDEKLR 
NCLDGLEVDVRLGEFTRAYKDGKNQETGLPLLWKLKDWPSPSASEEFIFYQRPEFIRSFP 
FLEYIHPRLGLlLNVAAKLPHYSLQlSroSGPK^ 
LVHTSEETTFERVRKTKPVPEEPDQKMSENESLLSPE 

ELALTVNPENLTENGDNMESSCTSSCAGGAQWDVFRRQDVPKLSGYLQRTFQKPDNIQTD
FVSRTC* 

>G291 (124..1197)

CAAGL^CCGAAAGATCTCTCTCTATTTGTTTGCCTTCTTCTTTCTTTCTGACTCAAACCC 
TCAAATCAATTCTCGCGATTAAGCAAAACCCTAGATTTATTCTACTCTTCGAAGTCGATT 
TCAATGGAAGGTTCCTCGTCAGCCATCGCGAGGAAGACATGGGAGCTAGAGAACAACATT 
CTCCCAGTGGAACCAACCGATTCAGCCTCCGACAGTATATTCCACTACGACGACGCTTCA 
CAAGCCAAAATCCAGCAGGAGAAGCCATGGGCCTCCGATCCTAACTACTTCAAGCGCGTT 
CACATCTCAGCCCTTGCTCTTCTCAAGATGGTGGTTCACGCTCGCTCCGGTGGCACAATC 
GAGATCATGGGTCTTATGCAGGGTAAAACCGAGGGTGATACAATCATCGTTATGGATGCT 
TTTGCTTTGCCTGTTGAAGGTACTGAGACTAGGGTTAATGCTCAGTCTGATGCCTATGAG 
TATATGGTTGAATACTCTCAGACCAGCAAGCTGGCTGGGAGGTTGGAGAACGTTGTTGGA
TGGTATCACTCTCACCCTGGGTATGGATGTTGGCTCTCGGGTATTGATGTTTCGACACAG 
ATGCTTAACCAAGAGTATCAGGAGCCATTCTTAGCTGTTGTTATTGATCCAACAAGGACT 
GTTTCGGCTGGTAAGGTTGAGATTGGGGCATTCAGAACATATCCAGAGGGACATAAGATC 
TCGGATGATCATGTTTCTGAGTATCAGACTATCCCTCTTAACAAGATTGAGGACTTTGGT 
GTACATTGCAAACAGTACTACTGATTGGACATG^ 

CACCTTCTGGATCTCCTTTGGAACAAGTACTGGGTGAACACTCTTTCTTCTTCCCCACTG 
TTGGGCAATGGAGACTATGTTGCCGGGCAAATATCAGACTTGGCTGAGAAGCTCGAGCAA 
GCGGAGAGTGAGCTCGCTAACTCCCGGTATGGAGGAATTGCGCCAGCCGGTCACCAAAGG 
AGGAAAGAGGATGAGCCTCAACTCGCGAAGATAACTCGGGATAGTGCAAAGATAACTGTC 
GAGGAGGTCCATGGACTAATGTCACAGGTTATGAAAGACA^ 

CAGTCCAAGAAGTCTGCTGACGACTCATCAGATCCAGAGCCGATGATTACATCGTGAAGT 
TGGTCTATTCTTTTGTTTTTTGGCTGCGGAAATTGACTATCGGTTTGACCCGGTTTATGA 
GGCAATGCCCATTGTTCCCTATATCTCTAGTGTAGTATCTGCTTCAGACAAAGATCTTTG 






GGTTATTAAATGACATTAACATAAAAAAAA 

>G291 Amino Acid Sequence (domain in AA coordinates: 132-160) 
MEGSSSAIARKTWELENNILPVEPTDSASDSIFHYDDASQAKIQQEKPWASDPNYFKRVH
ISALALLKIWVHARSGGTIEIMGLMQGKTEGDTIIVT4DAFALPVEGTETRW 
MVEYSQTSKLAGRLENVVGWYHSHPGYGCWLSGIDVSTQMLNQQYQEPFLAVVIDPTRTV
SAGKVEIGAFRTYPEGHKISDDHVSEYQTIPLNKIEDFGVHCKQYYSLDITYFKSSLDSH
LLDLLWNKYWVNTLSSSPLLGNGDYVAGQISDLAEKLEQAESQLANSRYGGIAPAGHQRR
KEDEPQLAKITRDSAKITVEQVHGLMSQVIKDILFNSARQSKKSADDSSDPEPMITS*
>G427 (49..1230)

TTTCCCTCTCCGAAACAGAAATTCAAAAACAAATTC^ 

AACAATCACTTTAATCATTTCACCGACCAAG^ 

CAGC^GC^GCAACAACATTTTCAAGAATC^^ 

AACAACTTCCTCAATCTCCACACAGCTGCCACAGCCGCCGCTACAAGCTCCGATTCTCCT 

TCTTCCGCCGCCGCTAACCAGTGGCTCTCACGATCCTCATCCTTCCTCCAACGAGGCAAC 

ACCGCAAACAACAACAACAACGAAACATCCGGTGACGTCATCGAAGACGTTCCCGGCGGA 

GAGGAGTCAATGATCGGAGAGAAGAAGGAGGCGGAGAGGTGGCAGAATGCGAGACACAAG 

GCGGAGATACTGTCTCATCCACTATACGAGCAACTTTTGTCGGCACACGTGGCGTGCCTG 

AGGATCGCAACGCCGGTGGATCAGCTTCCGAGGATAGACGCACAGCTTGCTCAGTCTCAA 

AACGTCGTGGCTAAGTACTCAACTTTAGAAGCCGCTCAAGGACTCCTCGCCGGCGATGAC 

AAGGAGCTTGACC^CTTCATGACGC^TTATGTACTATTGCTTTGCTCTTTCAAAGAAa^ 

CTGCAACAGCATGTTCGTGTTCATGCAATGGAAGCTGTTATGGCCTGTTGGGAGATTGAA 

CAGTCGCTTCAAAGTTTTACAGGAGTATCTCCTGGTGAAGGCACAGGAGCAACAATGTCT 

GAGGATGAAGATGAGCAAGTAGAGAGTGATGCTCATTTGTTTGATGGAAGCTTAGATGGG 

TTAGGGTTTGGTCCTCTAGTTCCCACTGAGAGCGAGAGATCTTTGATGGAACGAGTCAGA 

CAAGAACTCAAACATGAACTCAAGCAGGGTTACAAGGAGAAAATTGTGGACATAAGAG^ 

GAGATACTGAGGAAGAGAAGAGCTGGAAAATTACCAGGAGACACCACCTCTGTTCTCAAA 

TCATGGTGGCAATCTCATTCTAAGTGGCCTTACCCTACTGAGGAAGATAAGGCGAGGTTG 

GTG(^GGAGACGGGTTTGCAGCTCAAACAGATAAACAAT 

AGGAATTGGCATAGCAATCCATCTTCTTCTACCGTCTCAAAGAATAAACGCCGAAGCAAT 
GCAGGTGAAAACAGCGGAAGAGACCGTTGAGAT(^UVGCTTGCATGTAGAGATCCAAAAGC 
TTTATAGAAAGGTGGAGGCATGAAGACAAAGAATTCTTACACAACAAACGTAGGACGTAA 
TTTTGTGCCAGTACATGGTATGGCTTTCATATTTGGTAATGATTAGGGCCACACAAAATT 
AAACCCCAAAGCATGATTTGTAATATGAGGTTTTAGATGGACTTTATGATAGGATCGTCA 
GTCTTC^CTGCCATCTCCATTCTCCACC^TC^TCCATCATTATATCTTGTGAAAAAAAA 
A 

>G427 Amino Acid Sequence (domain in AA coordinates: 307-370) 

MAFHNNHFl^FTDQQQHQPPPPPQQQQQQHFQESAPPl^^ 

SDSPSSAAANQWLSRSSSFLQRGNTANNNl^ 

ARHKAEILSHPLYEQLLSAHVACLRIATPVDQLPRIDAQLAQSQNVVAKYSTLEAAQGLL

AGDDKELDHFMTHYVLLLCSFKEQLQQHVRVHAMEAVMACWEIEQSLQSFTGVSPGEGTG

ATMSEDEDEQVESDAHLFDGSLDGLGFGPLVPTESERSLMERVRQELKHELKQGYKEKIV

DIREEILRKRRAGKIiPGDTTSVIiKSWWQSHSKWPYPTEEDK^ 

NQRKRNWHSNPSSSTVSKNKRRSNAGENSGRDR* 

>G509 (122..1054)

CTTCCTCCTTTGCTAATAAACTTTTCTTTGAACCTTACACGCCTTGTTGATATTACTCTC 
TTAAATATATATTTTCGTACATTAACACAGACATATATAAAGCTAAAGATTTCTTCACGT 
AATGGGTTTGAAAGATATTGGGTCCAAATTGCCACCGGGGTTTCGATTTCATCCAAGTGA 
TGAAGAGTTGGTTTSTCATTATCTTTGCAACAAGATTAGGGCCAAATCTGATGATGGTGA 
TGTTGATGATGATGATGATGATGTTGATGAAGCTTTGAAGGGTTCTACTGATCTTGTGGA 
GATTGACTTGCATATCTGTGAGCCATGGGAGCTTCCTGATGTGGCAAAGTTAAACGCAAA 
GGAATGGTACTTCTTCAGTTTCCGTGATCGAAAGTATGCTACTGGATATCGCACGAACAG 
AGCGACAGTAAGCGGATACTGGAAAGCAACAGGAAAAGATCGAACGGTGATGGATCCACG 
TACAAGGC^UlTTGGTAGGGATGAGAAAAAC^CTAGTGTTCTAi^GAAACAGAGCACCA^ 
TGGGATCAAAACTACTTGGATCATGCACGAGTTCCGTCTTGAGTGTCCTAACATCCCACA 
TAAGGAAGACTGGGTCTTGTGCAGAGTGTTCAACAAAGGCAGAGACTCATCGCTACAAGA 
CAATAATTATTATAACAATGATAATCAGACGCAAAGGCTTGAAGTTAATGACGCTCCGGA 
TCTTAATTACAACAATCAGTTGCCACCTTTGCTATCATCCCCTCCTCATAATCATCAACA 
TGAGAAGATGAAAATCCAAGTTTGTGATCAGTGGGAGCAGCTAATGAAGCAGCCTTCAAG

GACCACCGGCCACCCCTATCATCACCATTGTCATCATCAAACCATAGCATGTGGTTGGGA 

GCAGATGATGATCGGTTCGCTGTCATCACCTTCGAGTCATGGCCCTGATCACGAGTCCTT 

TGCTAAATTTGCTTTACCGTCGACAATAACAACAGTGTCAACATCAGTGGTGATCATCAT 

CAGAATTATGAGAAGATTTTGTTGTCATCACTAGACATGACGAGTTTGGATCACGACAAG 

ACATGTATGGGATCATCATCGGATGGTGGTATGGTCTCTGATCTTCACATGGAATGTGGT 

GGATTGAGTTTTGAGACCGAAAATATCCTCGCTTTCCAATGAACATAATTCAAGGGGTTC 

GCCAATTTGTTGATTCGTGAATTATACAAACATTTTATCTATAGATTTATCACATTATCA 

AACATGTAAGTTGTGTGGCATTTGGGTATAGGGTTTGTGTGATTCTAGGTTTTTAGGACG 

ATGTATGTTGTTATATTTAGCGTGTTTTTAGGATTTATTCTCATTTTAAAATTATA 

AACCGATTACTATGAATACAATTAGTTTTCTTTGTTGTAAATAATATTTTAGATTATCAA

AAAAAAAAAAAAAA 

>G509 Amino Acid Sequence (domain in AA coordinates: 13-169) 
MGLKDIGSKLPPGFRFHPSDEELVCHYIiOtfKI 

IDLHICEPWELPDVAKLNAKEWYFFSFRDRKYATGYRTNRATVSGYWKATGKDRTVMDPR
TRQLVGMRKTLVFYRNRAPNGIKTTWIMHEFRLECPNIPHKEDWVLCRVFNKGRDSSLQD
ISramTNlTONQTQRLEVNDAPDL 

TTGHPYHHHCHHQTIACGWEQMMIGSLSSPSSHGPDHESFAKFALPSTITTVSTSVVIII

RIMRRFCCHH* 

>G519 (85..894)

CACAAAGATCCTCCGATTCGAAGGTTTATAAAAACTC^AATCGAATCTTATCCACAAGA 

AAACAACAAGGTACTTTTCCAAAAATGAAGGCGGAGTTGAATTTGCCGGCGGGAT 

TTTCATCCGACGGACGAAGAGCTTGTCAAGTTCTATCTTTGCCGGAGATGTGCGTCAGAA 

CCGATTAACGTTCCGGTTATCGCAGAGATTGACTTGTACAAATTCAATCCATGGGAGCTT 

CCAGAAATGGCGTTGTACGGTGAGAAAGAATGGTACTTCTTCTCGCATAGAGACCGGAAA 

TACCGZ^AACGGGTCGAGACCAAACCGGGCAGCTGGAACCGGTTATTGGAAAGCGACTGGA 

GCTGATAAACCGATCGGAAAACCGAAGACGTTAGGGATTAAGAAAGCACTCGTCTTCTAC 

GCAGGAAAAGCTCCGAAAGGGATTAAAACGAATTGGATTATGCACGAGTATCGTCTCGCT 

AATGTCGATCGATCTGCTTCTACCAACAAGAAGAACAACTTAAGACTTGATGATTGGGTT 

TTGTGTCGGATATACAATAAGAAAGGAACAATGGAGAAGTATTTACCGGCGGCGGCTGAG 

AAACCGACGGAAAAGATGAGTACGTCGGACTCAAGATGCTCAAGTCACGTGATTTCACCG 

GACGTCACGTGTTCTGATAACTGGGAGGTTGAGAGTGAGCCCAAATGGATTAATCTGGAA 

GACGCGTTAGAGGCATTTAATGATGACACGTCCATGTTTAGTTCCATTGGTTTGTTGCAA

AATGACGCCTTTGTTCCTCAGTTTCAGTACCAGTCCTCCGATTTCGTCGATTCGTTTCAG 

GACCCGTTCGAGCAGAAACCGTTCTTGAATTGGAATTTTGCTCCTCAAGGGTAAAAATAA 

TCGGCAAAAAGTTGAAGCTTTTCAGAGTCTTCGATCACCGGCATTGTGTCGGATCCTGAC 

CCGGAGACCAAGTCGGGTCATACGATTACATAATCGGGTTATTGAGATTTCCACATTTGG 

ATTTCCGAGACTAACCAACTTAACGGATTCTCGGGTAATTGGGGGGTTTTGCACAGGTGA 

ATCACACTGAGTCAGCAAGTTTCGATTTTTTGGTTTTGTTTTGTAATGATTGATTAAATG 

TCTAAAGATATGACGAAGTAGATTCAGAAGAACTGTAAAAGCAATTGTGACGACCCGTTA 

TGAATCATAAATATATTCAATGAAGCATGAGCTTATTTTTTTTTTAAAAAAAAA 

>G519 Amino Acid Sequence (conserved domain in AA coordinates: 11-104) 
MKAELNLPAGFRFHPTDEEIiVKFYLCRRCASEPI 

KEWYFFSHRDRKYPNGSRPNRAAGTGYWKATGADKPIGKPKTLGIKKALVFYAGKAPKGI
KTNWIMHEYRLANVDRSASTNKKNNIjRL^ 

SDSRCSSHVISPDVTCSDNWEVESEPKWINLEDALEAFNDDTSMFSSIGLLQNDAFVPQF
QYQSSDFVDSFQDPFEQKPFLNWNFAPQG* 

>G561 (86..nee) 

AATTTGTTTTTTTTTCTTTTGTGGGTTCAATTCGAATTGTTTTCCCTGAGACTCAAGTTA 
CTGTGTCATTACTCTGCATTGAGCAATGGGTAGCAACGAAGAAGGAAACCCCACTAACAA 
CTCTGATAAGCCATCGCAAGCTGCTGCTCCTGAGCAGAGTAATGTTCATGTGTATCATCA 
TGACTGGGCTGCTATGCAGGCATATTATGGGCCTAGAGTTGGTATACCTCAATATTACAA 
CTCAAATTTGGCGCCTGGTCATGCTCCACCGCCTTATATGTGGGCGTCTCCATCGCCAAT 
GATGGCTCCTTATGGAGCACCATATCCACCATTTTGCCCTCCTGGTGGAGTTTATGCTCA 
TCCTGGTGTTCAAATGGGCTCACAACCACAAGGTC 

TACAACCCCTTTGACCATTGATGCACCAGCTAATTCAGCTGGAAACTCAGATCATGGGTT 
CATGAAAAAGCTGAAAGAGTTCGATGGACTTGCAATGTCAATAAGCAATAACAAAGTTGG 
GAGTGCTGAACATAGCAGCAGTGAACATAGGAGTTCTCAGAGCTCCGAGAATGATGGCTC
TAGCAATGGTAGTGATGGTAATACAACTGGGGGAGAACAATCTAGGAGGAAAAGAAGGCA 
ACAAAGATCACCAAGCACTGGTGAAAGACCCTCATCTCAAAACAGTCTGCCTCTTAGAGG 
TGAAAATGAGAAACCCGATGTGACTATGGGGACTCCTGTTATGCCCACAGCAATGAGTTT 
CCAAAACTCTGCTGGCATGAACGGTGTGCCACAGCCATGGAATGAAAAAGAGGTTAAACG 
AGAGAAGAGAAAACAGTCAAACCGAGAATCTGCTAGGAGGTCAAGACTGAGGAAGCAGGC 
TGAAACAGAACAACTATOTGTCAAAGTTGACGCATTAGTAGCTGAGAACATGTCTCTGAG 
GTCTAAACTAGGCCAGCTAAACAATGAGTCTGAGAAACTACGGCTGGAGAACGAAGCTAT 
ATTGGATCAACTGAAAGCGCAAGCAACAGGGAAAACAGAGAACCTGATCTCTCGAGTTGA 
TAAGAAGAACTCTGTATCAGGTAGCAAAACTGTGCA^ 

GATAACCGATCCTGTCGCGGCTAGCTGACCGTGGCCGCAACAATGAGAACCCGATATTTC 

TTCCTTTGGGTTGTGATTGTAACTTAAAAGGAGACTTTTTGTTTTTATTCTTAGATTTGT 

AGCTCTCTGCATAGTGAGCATAAATTGATGTAATATGGTTTAAGAGATTCGGTGTTCTCT 

GGTGTGTGCTGCAACCACATAATTGGTGATAGATAGGTTTAGTTATATAAGCAAATGT 

TAGAGATAAGGGGAGACATATTTGATGGTCTTT 

>G561 Amino Acid Sequence (domain in AA coordinates: 248-308) 

MGSNEEGNPTNNSDKPSQAAAPEQSNVHVYHHDWAAMQAYYGPRVGIPQYYNSNLAPGHA

PPPYMWASPSPMMAPYGAPYPPFCPPGGVYAHPGVQMGSQPQGPVSQSASGVTTPLTIDA 

PANSAGNSDHGFMKKLKEFDGLAMSISNNKVGSAEHSSSEHRSSQSSENDGSSNGSDGNT

TGGEQSRRKRRQQRSPSTGERPSSQNSLPLRGENEKPDVTMGTPVMPTAMSFQNSAGMNG

VPQPWNEKEVKREKRKQSNRESARRSRLRKQAET^ 

ESEKLRLENEAILDQLKAQATGKTENLISRTO 

* 

>G590 (102..1223)

TCGACAGACACTCTCCCTCTCTCCATGCCCATAAAATCTCAAAGACTGTTTAAAAAAAAA 
AATGTTTTAGCTTTAACTGCTTTTTTTTTGTTGTTGGTGTAATGATATCACAGAGAGAAG 
AAAGAGAAGAGAAGAAGCAGAGAGTGATGGGAGATAAGAAATTGATTTCATCTTCTTCTT 
CTTCCTCGGTTTACGATACTCGTATGAATC^ 

ACGAAATCTCTCAGTTTCTCCGGCATATTTTCGACCGTTCTTCTCCTTTACCTTCTTACT 
ACTCCCCGGCGACGACTACAACGACGGCGTCTTTGATTGGTGTGCACGGGAGCGGTGACC 
(^CATGCAGATAACTCGAGAAGTCTCGTTTCTCATCATCCACCGTCAGATTCTGTGCTTA 
TGTCGAAACGTGTCGGAGATTTCTCTGAGGTTTTAATCGGCGGAGGATCAGGCTCAGCCG 

GGACTCGAGTATCGTCTTCTTCCGTTGGAGCTAGTGGCAACGAGACAGATGAGTATGACT 
GTGAAAGCGAGGAAGGAGGAGAAGCTGTAGTTGATGAAGCTCCCTCTTCCAAGTCAGGTC 
CTTCTTCTCGTAGTTCATCTAAAAGATGCAGAGCTGCTGAAGTTCATAATCTCTCTGAGA 
AGAGGAGGAGAAGTAGAATTAATGAAAAAATGAAAGCTTTACAAAGTCTCATCCCTAATT 
CAAATAAGACGGATAAGGCTTCAATGCTTGATGAAGCCATTGAGTATCTGAAACAGCTTC 
AGCTCCAAGTTCAGATGTTGACTATGAGAAATGGAATAAACTTGCATCCTTTGTGTTTAC 
CTGGAACTACATTACACCCATTGCAACTCTCTCAGATTCGACCCCCTGAAGCAACCAATG 
ATCCTCTGCTTAATC^TACC^TCAGTTTGCTTCGACTTCTAATGC^CCGGAAATGATCA 
ATACTGTGGCTTCTTCATACGCTTTGGAACCTTCTATTCGCAGTCACTTTGGACCTTTCC 
CTCTCCTTACTTCACCCGTGGAGATGAGTCGGGAAGGTGGGTTAACTCATCCAAGGTTGA 
ACATTGGTCATTCCAACGCAAACATAACCGGGGAACAAGCTCTGTTTGATGGACAACCTG 
ACCTAAAAGATCGAATTACTTGAACAGTGTCCCAACTTCGGGATCTCTATGTGTTCTTGT 
TTCTTAGAACGCAAGCCATAAAGCTGTCTGAC 

>G590 Amino Acid Sequence (domain in AA coordinates: 202-254) 

MISQREEREEKKQRVMGDKKLISSSSSSSVYDTRINHHLHHPPSSSDEISQFLRHIFDRS

SPLPSYYSPATTTTTASLIGVHGSGDPHADNSRSLVSHHPPSDSVLMSKRVGDFSEVLIG 

GGSGSAAACFGFSGGGNNNNVQGNSSGTRVSSSSVGASGNETDEYDCESEEGGEAVVDEA 

PSSKSGPSSRSSSKRCRAAEVHNLSEKRRRSRINEKMKALQSLIPNSNKTDKASMLDEAI 

EYLKQLQLQVQMLTMRN^ 

NAPEMINTVASSYALEPSIRSHFGPFPLLTSPVEMSREGGLTHPRLNIGHSNANITGEQA

LFDGQPDLKDRIT*

>G818 (65..1060)

GTATTTCTTACAATAAACGACCAAAAAGTTAATACAAGAAATAGAAACGGTGTAGGAAGC 
TACTATGACGGCAATTCCAAACGTCGTCGATATTGAATCTTCTTCCTCTTCGCTTTGTCA 
AGAGACGGCAACGGAGACCGTCACCGTTGAAAGAGGCTCGTCTGATTCATCTTCAAAGCC
AGACGACGTCGTTTTACTAATCAAGGAAGAGGAGGATGACGCGGTTAACTTGTCACTTGG 
TTTTTGGAAATTGCACGAGATAGGTTTAATAACACCGTTCTTGAGAAAGACGTTTGAGAT 
CGTCGATGACAAAGTAACAGACCCGGTTGTATCATGGAGCCCGACCCGTAAAAGCTTTAT 
CATTTGGGATTCTTACGAGTTCTCAGAGAATCTACTTCCCAAATACTTCAAGCACAAGAA 
CTTCTCCAGTTTTATTCGTCAGCTTAACTCTT^ 

GTGGGAATTTGCTAACGAAGGGTTTGAAGGAGGGAAGAAACATTTGCTTAAGAACATCAA 
GAGGAGAAGCAAAAACACTAAATGTTGTAACAAGGAAGCGAGTACCACCACGACAGAGAC 
TGAGGTTGAGTCATTGAAGGAGGAACAGAGTCCAATGAGATTGGAGATGTTGAAGCTGAA 
ACAACAACAAGAAGAATCTCAACATCAGATGGTCACTGTGCAGGAGAAGATCCACGGAGT 
TGATACCGAACAACAGCATATGCTTAGTTTCTTTGCAAAGTTGGCTAAAGATCAAAGATT 
TGTAGAGAGACTGGTGAAGAAGAGAAAGATGAAAATACAGAGAGAGCTAGAAGCAGCTGA 
ATTCGTGAAGAAGCTCAAGTTGCTTCAGGATCAAGAAACTGAAAAGAACTTGTTAGATGT 
AGAAAGAGAATTTATGGCCATGGCTGCAACAGAACACAATCCCGAGCCTGACATTTTGGT 
GAACAATCAAAGCGGGAATACGAGATGTCAGCTTAACTCAGAGGACCTACTTGTTGACGG 
TGGCTCAATGGATGTAAATGGGAGGATAGAGATAGAGTAGAGCAAAACCGGTAACATAGC 
AATAGAGAAGGTACCAAATCCCAAGGCTTGAGATCCGAAT 

>G818 Amino Acid Sequence (domain in AA coordinates: 70-162) 
MTAIPNWDIESSSSSLCQETATETVTVERGSSDSSSKP 

WKLHEIGLITPFLRKTFEIVDDKVTDPVVSWSPTRKSFIIWDSYEFSENLLPKYFKHKNF

SSFIRQLNSYGFKKVDSDRWEFANEGFQGGKKHLLKNIKRRSKNTKCCNKEASTTTTETE

VESLKEEQSPMRLEMLKLKQQQEESQHQMVTVQEKIHGVDTEQQHMLSFFAKLAKDQRFV 

ERIiVKKRKMKIQRELEAAEFVKKLKLL^ 

NQSGNTRCQLNSEDLLVDGGSMDVNGRIEIE*

>G849 (218..2077)

AACTCGAGAATTCTTCATTTCTTTTAAATCTTAGAATCTCGAGTTTTTGTATAAATCGAT 

TCTAATTTTTCCTTTGTACATTGTTTTATATATACATAAAACACACAAATCGGGTATGGG 

GGAATTTGGGTTTTAAGATAGCGTGATCTGTAATAATAAGTGGTTCGCGATCGTGATCAA 

GAAACTGGTGGCTGATAGTGATATGCATATTTGAGAGATGGTGTTCAAGAGAAAGTTAGA 

TTGCCTTTCCGTGGGATTTGATTTTCCCAAGATTCCCAGAGCTCCTCGTTCATGCAGGAG 

GAAGGTTCTAAACAAGAGGATTGATCATGATGATGATAACACTCAGATCTGTGCAATTGA

CTTACTAGCTTTGGCTGGAAAGATTCTACAGGAAAGCGAGAGTTCCTCTGCGTCTTCTAA 

TGCATTTGAAGAAATTAAGCAAGAGAAAGTAGAAAATTGCAAGACTATTAAATCTGAGTC 

TTCTGACCAAGGAAACTCTGTGTCAAAGCCTACTTATGATATCTCTACTGAGAAGTGTGT 

GGTGAACAGTTGTTTTTCATTTCCGGATAGTGACGGCGTTTTGGAGCGGACTCCGATGTC 

TGATTACAAGAAGATTCATGGTTTGATGGATGTAGGGTGTGAAAACAAGAATGTAAATAA 

TGGGTTCGAGCAAGGAGAAGCAACCGATCGCGTGGGTGATGGAGGCTTAGTCACTGATAG 

TTGCAACTTAGAGGATGCAACTGCGTTAGGTCTGCAGTTTCCGAAATCAGTCTGTGTGGG 

TGGTGATTTAAAATCACCATCGACCTTGGATATGACCCCTAATGGTTCCTATGCTAGACA 

TGGGAACCATACTAACCTAGGTAGAAAAGATGATGATGAAAAATTCTATAGTTACCATAA 

ACTTAGCAATAAATTTAAGTCGTATAGGTCTCCAACAATTCGAAGAATAAGAAAGTCGAT 

GTCGTCCAAATACTGGAAACAAGTTCCAAAAGATTTTGGATACAGTAGAGCTGATGTGGG 

TGTGAAGACTCTTTATCGCAAAAGAAAATCATGTTATGGTTACAACGCATGGCAGCGTGA 

GATCATTTATAAGAGAAGAAGATCACCTGACAGAAGCTCGGTCGTAACTTCTGATGGAGG 

ACTCAGTAGTGGAAGTGTTTCCAAGTTACCCAAGAAGGGAGATACAGTAAAGCTAAGCAT 

TAAGTCCTTTAGGATTCCAGAGCTTTTTATTGAAGTTCCAGAAACTGCAACAGTAGGATC 

ACTAAAGAGGACTGTGATGGAGGCTGTCAGTGTTTTACTCAGCGGAGGAATACGTGTTGG 

GGTGTTAATGCATG€GAAGAAGGTTAGAGATGAAAGGAAAACTCTGTCCCAGACTGGGAT 

CTCATGTGATGAAAATCTAGACAACCTTGGGTTCACCTTGGAGCCTAGTCCCAGCAAAGT 

TCCCCTACCTTTGTGTTCTGAAGATCCTGCTGTGCCAACCGACCCTACAAGTTTGTCTGA 

ACGGTCTGCGGCGTCTCCTATGCTAGATTCTGGAATTCCACATGCAGATGACGTGATTGA 

TTCAAGAAATATTGTGGACAGTAACCTCGAATTAGTTCCATATCAGGGTGACATATCTGT 

TGATGAACCTTCATCAGATTCAAAAGAGCTTC 

GCTTGCCATAGTTCCGTTGAACCAGAAACCTAAGCGTACTGAGCTAGCCCAGAGGAGAAC 
TAGGAGACCCTTCTCTGTGACAGAGGTAGAAGCTCTTGTACAAGCAGTTGAGGAACTCGG 
GACTGGAAGATGGCGTGATGTAAAATTGCGTGCTTTCGAGGATGCAGATCATCGGACTTA 
CGTGGACTTGAAGGACAAATGGAAGACGCTAGTTGACACAGCAAGTATATCCCCAGAGCA 
ACGAAGAGGAGAGCCGGTGCCACAAGAACTGCTAGACAGAGTCTTGAGGGCATACGGGTA
TTGGTCGCAGCACCT^AGGAAAACATCAGGCGAGAGGAGCGTCCAAAGATCCAGACATGAA 
CAGAGGTGGAGCTTTTGAATCAGGTGTTTCAGTGTAAAAAAGGAGGTACGCATTGGTGGG 
TGGGTGTACAGAAGCAAACAACACAATAAATGGACAACTCAATTTCTGCAAAGTTTAATT 
GTCTTTATTTCTCGTTTTTTTTTTTTTTTCTCCTACATACACTTTTTTTTTTCT 

>G849 Amino Acid Sequence (domain in AA coordinates: 324-413, 504-583) 
MVFKRKLDCLSVGFDFPNIPRAPRSCRRKVL 

ESSSASSNAFEEIKQEKVENCKTIKSESSDQGNSVSKPTYDISTEKCVVNSCFSFPDSDG

VLERTPMSDYKKIHGLMDVGCENKN^ 

FPKSVCVGGDLKSPSTLDMTPNGSYARHGl^TNLGRKD^ 

IRRIRKSMSSKYWKQVPKDFGYSRADVGVKTLYRKRKSCYGYNAWQREIIYKRRRSPDRS
SVVTSDGGLSSGSVSKLPKKGDTVKLSIKSFRIPELFIEVPETATVGSLKRTVMEAVSVL
LSGGIRVGVIjMHGKKVRDERKTLSQTGISCDE 

TDPTSLSERSAASPMLDSGIPHADDVIDSRNIVDSNLELVPYQGDISVDEPSSDSKELVP 
LPELEVXAX.AIVPLNQK^KRTELAQRRT^ 

EDADHRTYVDLKDKWKTLVHTASISPQQRRGEPVPQELLDRVLRAYGYWSQHQGKHQARG 

ASKDPDMNRGGAFESGVSV* 

>G892 (21..1004)

TATAACAATTCCTTCCAACAATGTCATTGAGTCAGCCAATAACACGGACCGATAGTGCAC 
CCAATGGAGCATTTAGGACTTTTGGTCTCTACTGGTGCTACCATTGTGATCGTATGGTCA 
GAATTGCATCCTCTAACCCATCAGAGATCGCCTGTCCTCGATGTTTGAGGCAATTTGTCG 
TTGAGATTGAAACGAGACAACGGCCTCGGTTTACTTTCAACCATGCTACTCCGCCTTTTG 
ATGCTTCTCCTGAGGCTCGTCTTCTCGAAGCTCTCTCGCTCATGTTTGAGCCTGCAACCA
TAGGTAGGTTTGGTGCAGACCCATTTCTTAGGGCAAGATCCAGAAACATCTTGGAACCTG 
AATCAAGACCCCGACCGCAACATCGAAGACGACACAGCCTTGACAATGTTAACAATGGTG 
GTTTACCTCTACCAAGAAGAACATATGTTATTCTCCGGCCCAATAATCCGACTAGTCCAC 
TCGGAAACATAATTGCGCGACCAAATCAAGCACCACCACGGCATGT
ACTTTACTGGAGCATCAAGCTTAGAGC^ 

CTGGACCACCACCTGCGTCAGAACCCACCATTAATTCCCTACCATCTGTGAAAATAACAC 

CACAACATCTAACTAACGACATGTCCCAATGCACAGTGTGCATGGAAGAATTCATTGTTG 

GTGGGGACGCAACGGAATTACCATGTAAACATATTTACCATAAAGATTGTATAGTCCCGT 

GGCTTAGGCTTAACAATTCTTGCCCTATCTGCCGCCGTGACCTGCCACTTGTCAACACCG 

TTGCTGAATCTCGAGAAAGGAGCAATCCTATTAGACAAGACATGCCTGAAAGAAGGCGTC 

CAAGGTGGATGCAACTCGGTAACATTTGGCCATTTAGAGCAAGATACCAAAGGGTTAGTC 

CAGAAGAAA(^GC^AACCAGAATCCTCGAGATAACAGGAGCTAACTCTGAATATTCCATG 

GGAAATAAAAATCGTGACTATCTATATGTATAGACTCTATGAGACATTGTCTATTTGAAT 

GTGCATGTATATCTCAGAAATAAACTCAAGCGAAACATATTTAACGACTAAAAAAAA 

>G892 Amino Acid Sequence (domain in AA coordinates: 177-270) 

MSLSQPITRTDSAPNGAFRTFGLYWCYHCDRMVRIASSNPSEIACPRCLRQFVVEIETRQ 

RPRFTFNHATPPFDASPEARLLEALSLMFEPATIGRFGADPFLRARSRNILEPESRPRPQ

HRRRHSLDNVimGGLPIjPRRTYyiLRPIOTPTSPLGNIIAP 

LEQLIEQLTQDDRPGPPPASEPTINSLPSVKITPQHLTNDMSQCTVCMEEFIVGGDATEL 
PCKHIYHKDCIVPV^RLNNSCPICRRDL 
WI WPFRARYQRVS PEETANQNPRDNRS * 
>G961 (1..1200) 

ATGTCAAAATCTATGAGCATATCAGTGAACGGACAATCTCAAGTGCCTCCTGGGTTTAGG 
TTTCATCCGACCGAGGAAGAGCTGTTGCAGTATTATCTCCGGAAGAAAGTTAATAGCATC 
GAGATCGATCTTGATGTCATTCGCGACGTTGATCTCAACAAGCTCGAGCCTTGGGACATT 
CAAGAGATGTGTAAAATAGGAAXI^CGCCACAAAAC^ 

GACAAAAAATATCCGACGGGAACGAGAACTAACAGAGCCACTGCGGCTGGATTTTGGAAA 
GCAACTGGCCGCGACAAGATCATATATAGCAATGGCCGTAGAATTGGGATGAGAAAGACT 
CTTGTTTTCTACAAAGGCCGAGCTCCTCACGGCCAAAAATCTGATTGGATCATGCATGAA 
TATAGACTCGATGACAACATTATTTCCCCCGAGGATGTCACCGTTCATGAGGTCGTGAGT 
ATTATAGGGGAAGCATCACAAGACGAAGGATGGGTGGTGTGTCGTATTTTCAAGAAGAAG
AATCTTCACAAAACCCTAAACAGTCCCGTCGGAGGAGCTTCGCTGAGCGGCGGCGGAGAT
ACGCCGAAGACGACATCATCTCAGATCTTCAACGAGGATACTCTCGACCAATTTCTTGAA
CTTATGGGGAGATCTTGTAAAGAAGAGCTAAATCTTGACCCTTTCATGAAACTCCCAAAC 
CTCGAAAGCCCTAACAGTCAGGCAATCAACAACTGCCACGTAAGCTCTCCCGACACTAAT

CATAATATCCACGTCAGCAACGTGGTCGACACTAGCTTTGTTACTAGCTGGGCGGCTTTA 

GACCGCCTCGTGGCCTCGCAGCTTAACGGACCCACATCATATTCAATTACAGCCGTCAAT 

GAGAGCCACGTGGGCCATGATCATCTCGCTTTGCCTTCCGTCCGATCTCCGTACCCCAGC 

CTAAACCGGTCCGCTTCGTACCACGCCGGTTTAACACAGG7UVTATACACCGGAGATGGAG 

CTATGGAATACGACGACGTCGTCTCTATCGTCATCGCCTGGCCCATTTTGTCACGTGTCG 

AATGTTTTGCTGCTTGTTTGTCTCCTTCGTCTGCAGCTTCAGTTCTGGCCGTTCCAACCA 

TGGCAGAGGCAGGTTCATTTCGATCTTTCATCGCCTCAGATGCAGATCTCTCTCCATTGA 

>G961 Amino Acid Sequence (conserved domain in AA coordinates: 15-140) 

MSKSMSISVNGQSQVPPGFRFHPTEEELLQYYLRK 

QEMCKIGTTPQ3STDWYFFSHKDKKYPTGTRTNRATAAGFWKATGRDKJIYSN 
LVFYKGRAPHGQKSDWIMHEYRLDDNIISPEDVTVHEVVSIIGEASQDEGWVVCRIFKKK
NI^KTLNSPVGGASLSGGGDTPKTTSSQIFNEDT 

LESPNSQAINNCHVSSPDTNHNIHVSNVVDTSFVTSWAALDRLVASQLNGPTSYSITAVN
ESHVGHDHLALPSVRSPYPSLNRSASYHAGLTQEYTPEMELWNTTTSSLSSSPGPFCHVS
NVLLLVCLLRLQLQFWPFQPWQRQVHFDLSSPQMQISLH*
>G1465 (163..1125)

TATCCTTCGCAAGACCCTTCCTCTATATAAGGAAGTTCATTTCATTTGGAGAGGACACGC 
TGACAAGCTGACTCTAGCTTATCTGGTACCGTCGACCTCATTCTTGCGTTTGATCTTTCT
TTCTCTAGATCCCATATTTTTCTTGATCAATTTAGTTTCATTATGGAGGAAGATGCAGCT 
TTTGATCTACTCAAAGCCGAACTCTTAAACGCAGAAGACGATGCAATAATCTCACGTTAT 
CTGAAGCGTATGGTCGTCAACGGAGACTCATGGCCTGATCACTTCATCGAAGACGCAGAC 
GTGTTCAACAAGAATCCAAATGTGGAGTTCGATGCTGAGAGCCCTAGCTTCGTGATAGTT
AAACCTCGAACAGAGGCTTGTGGTAAAACCGATGGATGTGAAACTGGTTGCTGGAGGATC 
ATGGGTCGTGATAAACCGATAAAATCGACGGAGACTGTGAAGATTCAAGGGTTCAAGAAG 
ATTCTCAAGTTCTGCCTAAAGAGGAAACCTAGAGGATACAAGAGAAGTTGGGTAATGGAA 
GAGTATAGGCTTACCAATAACTTGAACTGGAAGCAAGATCATGTGATTTGCAAGATTCGG 
TTTATGTTTGAAGCTGAAATCAGTTTCTTGCTAGCCAAGCATTTCTACACTACATCAGAA 
TCACTTCCTCGAAATGAGCTGTTGCCAGCTTACGGATTCCTTTCATCAGATAAGCAATTG 
GAGGATGTATCTTATCCGGTGACGATAATGACTTCTGAAGGAAACGATTGGCCTAGCTAC 
GTTACCAACAATGTGTATTGTCTGCATCCATTGGAGCTCGTTGATCTTCAAGATCGGATG 
TTTAATGATTACGGAACCTGCATCTTCGCTAACAAGACTTGTGGTAAAACCGATAGATGC 
ATTAATGGTGGTTACTGGAAAATTTTGCACCGTGATAGGCTGATCAAGTCAAAGTCCGGG 
ATAGTTATXGGTTTCAAGAAGGTGTTTAAGTTTCATGAAACGGAGAAAGAAAGATACTTC 
TGTGGTGGAGAAGATGTGAAGGTAACTTGGACTCTAGAAGAGTATAGGCTTAGCGTGAAG 
CAGAATAAATTCTTGTGCGTTATCAAGTTTACTTATGATAACTAAGAATCTTTTCTTTGG 
ATTTTATGATCATCTTAGTATCGCGACCGCTCTAGACAGGCCTCGTACCGGATCCTCTAG 
CTAGAGCTTTCGTTCGTATCATCGGTTTCGACAACG 

>G1465 Amino Acid Sequence (conserved domain in AA coordinates: 242-306) 

MEEDAAFDLLKAELLNAEDDAIISRYLKRMVVNGDSWPDHFIEDADVFNKNPNVEFDAES

PSFVIVKPRTEACGKTDGCETGCWRIMGRDKPIKSTETVKIQGFKKILKFCLKRKPRGYK

RSWVMEEYRLTNNLmfKQDHVICKIRFMFEAEISFLIaAKHFYTTSESLPRNEL 

SSDKQLEDVSYPVTIMTSEGlTOWPSYWNNVYCLHPIiEIiVIDLQDRMFNDYGTCIFANKTC 

GKTDRCINGGYTtfKILHRDRLIKSKSGIVIGFK^ 

YRLSVKQNKFLCVIKFTYDN*

>G425 (45..1196)

GAAAACAGTCTTCTCTTCTCCGATCCCAAAAACGCAGGAAAACAATGTCGTTTAACAGCTCCC 

ACCTCCTTCCTCCACAAGAAGACCTTCCTCTCCGACACTTCACCGATCAATCACAGCAACCTC 

CGCCGCAGCGTCACTTCTCTGAAACACCTTCGCTTGTCACCGCCAGTTTCCTCAACCTCCCTA 

CCACCCTTACCACTGCGGATTCCGATCTCGCTCCTCCGCACCGCAACGGAGACAATTCCGTT 

GCTGATACAAACCCACGCTGGCTCTCCTTTCATTCGGAGATGCAAAATACTGGAGAAGTACG 

TTCTGAAGTTATCGACGGAGTCAACGCCGATGGTGAAACGATACTCGGCGTTGTAGGAGGT 

GAAGATTGGCGGAGTGCTAGCTATAAGGCGGCGATTTTAAGACATCCGATGTACGAGCAGC 

TTCTTGCGGCTCACGTGGCTTGCCTTAGGGTTGCGACTCCCGTTGACCAGATTCCGAGGATC 

GATGCTCAGCTCAGTCAGTTGCATACCGTCGCCGCGAAATACTCCACTCTTGGTGTGGTTGTT 

GACAACAAGGAACTTGATCATTTCATGTCACATTATGTTGTCTTGTTATGTTCATTTAAAGAACA 

ACTCCAACACCACGTTTGTGTCCATGCAATGGAAGCCATTACGGCTTGTTGGGAGATTGAACA 
ATCACTGCAATCCCTAACTGGAGTTTCTCCAAGTGAAAGTAATGGTAAGACAATGTCGGATGA

TGAAGATGATAATCAAGTAGAGAGCGAGGTGAACATGTTTGATGGAAGTTTGGACGGTTCAG 

ATTGCTTGATGGGGTTTGGTCCTCTTGTTCCAACCGAGAGAGAGAGATCTTTGATGGAACGTG 

TGAAGAAAGAACTGAAGCATGAGCTTAAACAGGGTTTCAAAGAGAAGATTGTGGACATAAG 

AGAAGAGATAATGAGGAAGAGAAGGGCGGGAAAGTTGCCAGGAGATACGACTTCTGTACT 

CAAAGAATGGTGGCGAACTCACTCGAAATGGCCATACCCAACTGAGGAAGATAAGGCAAAA 

G7VAACTGGAACAGCAACTCTTCCACGTCATCTACTCTCACCAAGAACAAACGTAAACGGACC 

GGGAAGTCGTAGGTGACATAGCGGCTAACTAGAGGATGGTTCTTTGCCATGTGAATTCTTGG 

GAACCGTATATGAAAGAAACGAATCCGGTTCTATGCTCGTACAGAGTGTGTTATTTGTATAGT 

GGATACCGGTTAGCCTATGAAACCGGATTCTGGAGTCCAAATTGTTGTTTGTAACGACTTAGT 

AGTTTTTGGAAGTGATCTGTTTCGTTGGTTTGCGTCTTGTAACGAACGCTTAAGCAAGTGTGGG 

TTTTTTCTTGTAAAGTGTCAATATGTTCGTTTGT 

AAACTAGCTTGAAATGTAAAAAAAAAAAAAAAA 

>G425 Amino Acid Sequence (domain in AA coordinates: TBD) 
MSFNSSHIiLPPQEDLPLRHFTDQSQQPPPQRHFSETPSLV^ 

NGDNSVADTNPRWLSFHSEMQNTGEVRSEVIDGVNADGETILGVVGGEDWRSASYKAAILR

HPMYEQLLAAHVACLRVATPVDQIPRIDAQLSQLHTVAAKYSTLGVVVDN^ 

LCSFKEQLQHHVCVHAME^ITACWEIEQSLQSL^ 

FDGSLDGSDCLMGFGPLVPTERERSLMERVKKELKHELK^ 

GDTTSVLKEWWRTHSKWPYPTEEDIOUaJVQETG^ 

KNKRKRTGKS* 

>G347 (1..570)

atgaaagtagcagatatgcaggaccagctggtgtgtcatggttgtaggaatttattgatg
tatcctagaggagcatctaatgtgcgttgtgcgttatgtaacactatcaacatggttcct 
cctcctcctccacctcacgacatggcacacattatatgtggtggttgtagaacaatgctt 
atgtatacgcgtggggctagtagcgtaagatgctcttgctgtcaaactacgaaccttgtg 
ccagcgcactccaatcaggttgcccatgctccttccagtcaggttgcgcagatcaattgt 
gggcattgtcggacgaccctcatgtatccttacggtgcatcatccgtcaaatgcgctgtt 
tgtcaattcgtaactaacgttaatatgagcaatggaagggtacctctcccaactaaccgg 
ccaaatggaacagcttgtcccccctctacatcaacttcaacaccaccctctcagacccaa 
accgttgttgtagaaaaccccatgtccgttgatgaaagcggaaagttggtgagcaatgtt 
gttgttggagtgacaactgacaaaaagtaa 

>G347 Amino Acid Sequence (domain in AA coordinates: 9-39, 50-70, 80-127) 

MKVADMQDQLVCHGCRNLLMYPRGASNVRCALCNTINMVPPPPPPHDMAHIICGGCRTML

MYTRGASSVRCSCCQTTNLVPAHSNQVAHAPSSQVAQINCGHCRTTLMYPYGASSVKCAV 

CQFVTNVNMSNGRVPLPTNRPNGTACPPSTSTSTPPSQTQTVVVENPMSVDESGKLVSNV

VVGVTTDKK*

>G1512 (1..732) 

ATGGAAGGGAACTTCTTCATCAGGTCTGATGCTCAACGAGCACATGACAATGGCTTCATA 
GCCAAACAAAAACCTAATCTCACCACGGCTCCAACAGCAGGTCAAGCTAATGAAAGTGGC 
TGTTTTGACTGCAACATCTGTTTAGACACAGCCCATGATCCGGTGGTCACTCTCTGCGGG 
CACCTTTTCTGCTGGCCTTGCATTTACAAGTGGTTACATGTTCAGTTATCTTCTGTCTCC 
GTTGATCAGCACCAGAACAATTGCCCTGTTTGTAAATCCAACATTACTATCACCTCTTTG 
GTTCCTCTCTATGGAAGAGGCATGTCTTCGCCTTCTTCCACGTTTGGCTCCAAGAAACAA 
GACGCACTGTCCACTGACATACCCCGCAGACCTGCTCCATCAGCCTTACGCAATCCGATT 
ACGTCAGCATCATCTCTGAACCCAAGCTTGCAACATCAAACTCTGTCTCCTTCATTTCAT 
AATCATCAGTATTCCCCTCGTGGCTTCACCACAACCGAATCAACCGACCTTGCCAATGCT 
GTAATGATGAGTTTCCTCTACCCTGTGATTGGAATGTTTGGAGACCTGGTCTACACCAGG 
ATATTCGGGACCTTCACAAACACAATAGCTCAGCCTTACCAAAGCCAGAGGATGATGCAG 
CGTGAGAAGTCTCTTAATCGGGTATCGATATTCTTCCTTTGTTGCATCATCCTTTGCCTC 
CTTCTCTTCTAG 

>G1512 Amino Acid Sequence (domain in AA coordinates: 39-93) 
MEGNFFIRSDAQRAHDNGFIAKQKPNLTTAPTAGQANESGCFDCNICLDTAHDPVVTLCG 
HLFCWPCIYKWLHVQLSSVSVDQHQNNCPVCKSNITITSLVPLYGRGMSSPSSTFGSKKQ
DALSTDIPRRPAPSALRNPITSASSLNPSLQHQTLSPSFHNHQYSPRGFTTTESTDLANA
VMMSFLYPVIGMFGDLVYTRIFGTFTNTIAQPYQSQRMMQREKSLNRVSIFFLCCIILCL
LLF* 
>G2069 (1..1026)

ATGGAAGGAGGAGGAAGAGGACCAAATCAAACGATTCTCAGTGAAATAGAACATATGCCT 

GAAGCTCCACGTCAACGTATCTCTCATCACCGTCGAGCTCGCTCTGAAACCTTCTTCTCC 

GGCGAATCAATCGACGATCTCCTCTTATTCGATCCTTCCGATATCGATTTCTCTTCTCTA 

GACTTCCTCAACGCTCCACCACCACCACAACAATCACAACAACAACCGCAAGCTTCTCCC 

ATGTCCGTTGATTCGGAAGAAACCTCATCGAACGGTGTTGTTCCTCCTAATTCTCTTCCT 

CCAAAACCCGAAGCTAGATTCGGTCGCCATGTTCGTAGCTTCTCGGTTGATTCCGATTTC

TTCGATGATTTGGGTGTTACTGAGGAGAAGTTTATAGCTACAAGTTCAGGAGAGAAGAAG 

AAAGGGAATCATCATCATAGCAGGAGTAATTCTATGGATGGAGAGATGAGTTCGGCGTCG 

TTTAATATCGAATCGATTTTAGCTTCTGTGAGTGGTAAAGATAGTGGGAAGAAGAATATG 

GGTATGGGTGGTGATAGACTTGCTGAGCTTGCTTTGCTTGATCCTAAAAGAGCTAAAAGG 

ATTTTAGCGAATAGACAATCTGCGGCGAGGTCGAAAGAGAGGAAGATTAGGTATACTGGT

GAGTTAGAGAGGAAGGTTCAGACACTTCAGAATGAAGCTACTACATTGTCTGCTCAAGTC

ACTATGTTACAGAGAGGAACATCAGAGCTGAACACTGAAAATAAACACCTCAAAATGCGG 

CTTCAAGCTTTAGAGCAACAAGCTGAACTTAGGGATGCTTTGAATGAAGCGCTGCGGGAT 

GAACTGAACCGACTTAAGGTGGTAGCTGGAGAAATTCCTCAGGGGAATGGAAATTCTTAC 

AACCGTGCTCAATTCTCATCTCAGCAATCGGCAATGAATCAGTTTGGGAACAAAACGAAC 

CAACAGATGAGTACAAACGGGCAGCCATCGCTCCCAAGCTACATGGATTTCACCAAGAGA 

GGCTGA 

>G2069 Amino Acid Sequence (domain in AA coordinates: TBD) 
MEGGGRGPNQTILSEIEHMPEAPRQRISHHRRARSETFFSGESIDDLLLFDPSDIDFSSL
DFLHAPPPPQQSQQQPQASPMSVDSEETSSNGVVPPNSLPPKPEARFGRHVRSFSVDSDF
FDDLGVTEEKFIATSSGEKKKGNHHHSRSNSMDGEMSSASFNIESILASVSGKDSGKKNM
GMGGDRLAELALLDPKRAKRILANRQSAARSKERKIRYTGELERKVQTLQNEATTLSAQV
TMLQRGTSELNTENKHLKMRLQALEQQAELRDALNEALRDELNRLKVVAGEIPQGNGNSY
NRAQFSSQQSAMNQFGNKTNQQMSTNGQPSLPSYMDFTKRG*
>G1852 (55..1857)

CATCTGATCTGCTCTCGAAGACGAAAGCTTCGAGTACTGGTTGAAGCTAAAGCTATGGGA 
CACGTGAATCTACCTGC^TCAAAGCGTGGTAACCCTCGTCAATGGCGTCTCCTCGACATC 
GTAACCGCTGCTTTCTTCGGTATCGTACTTCTCTTCTTCATCCTTTTATTCACTCCTCTT 
GGTGATTCCATGGCGGCTTCTGGTCGGCAAACGCTGCTTCTCTCTACGGCGTCAGATCCG 
AGGCAACGGCAGCGATTAGTGACTTTGGTTGAAGCTGGTCAGCATTTGCAACCGATCGAG
TATTGTCCTGCGGAAGCTGTTGCTCATATGCCTTGTGAGGATCCGAGAAGGAATAGTCAG 
CTTAGTAGAGAGATGAATTTCTATAGGGAGAGACATTGTCCTTTGCCTGAGGAGACTCGG 
CTCTGTTTGATTCCTCCGCCTTCTGGTTATAAAATTCCTGTTCCGTGGCCTGAGAGTCTT 
CACAAGATTTGGCATGCAAACATGCCATATAACAAAATTGCTGACCGGAAAGGTCATCAA 
GGATGGATGAAAAGGGAAGGGGAATACTTTACTTTCCCAGGCGGTGGCACGATGTTTCCT 
GGCGGAGCTGGCCAATACATTGAAAAGCTTGCACAGTATATTCCGCTTAATGGTGGAACT 
TTGAGAACTGCTCTTGACATGGGATGCGGGGTAGCTAGTTTTGGAGGTACTCTACTATCT 
CAAGGCATTCTAGCCCTCTCATTTGCTCCAAGAGATTCACATAAATCTCAAATTCAGTTC 
GCTTTGGAAAGAGGAGTGCCTGCATTTGTTGCCATGCTTGGCACTCGTAGACTCCCCTTT 
CCTGCATACTCCTTTGACCTGATGCACTGTTCCCGATGTTTGATTCCTTTTACGGCTTAC 
AATGCAACTTACTTCATCGAAGTAGATAGGTTACTGCGCCCTGGAGGATATCTTGTAATC 
TCTGGCCCACCTGTACAATGGCCTAAACAAGACAAAGAATGGGCTGATCTTCAGGCGGTG 
GCTAGAGCTTTGTGCTATGAGCTAATTGCGGTTGATGGAAACACTGTCATCTGGAAGAAG 
CCTGTTGGAGATTCATGTCTACCTAGCCAGAATGAGTTTGGGCTTGAGTTGTGTGATGAG 
TCTGTTCCGCCAAGTGATGCATGGTATTTTAAATTGAAGAGGTGTGTTACCAGGCCATCA 
TCCGTCAAAGGAGAACACGCTTTGGGAACTATATCCAAGTGGCCGGAGAGGCTTACTAAA 
GTTCCTTCTAGGGCCATTGTCATGAAAAACGGATTGGATGTGTTTGAAGCAGATGCAAGG 
CGGTGGGCAAGACGCGTTGCTTATTACAGGGATTCTCTTAACTTGAAGCTGAAATCTCCA 
ACTGTCCGCAATGTCATGGACATGAACGCATTCTTCGGAGGCTTTGCAGCAACCCTTGCA 
TCTGATCCTGTGTGGGTTATGAATGTCATTCCAGCTCGGAAGCCATTAACTCTTGACGTG 
ATTTATGACAGAGGTCTCATCGGTGTTTACCATGATTGGTGTGAACCATTTTCAACATAT 
CCCCGCACGTATGATTTCATCCATGTATCAGGAATTGAATCACTGATAAAACGACAAGAC 
TCAAGCAAATCGAGGTGTAGCCTAGTAGATCTAATGGTAGAGATGGACAGAATATTACGT 
CCAGAAGGAAAGGTTGTGATCCGAGACTCTCCTGAGGTGCTAGATAAAGTCGCACGAATG
GCTCATGCTGTAAGATGGTCTTCTTCCATACACGAGAAAGAACCTGAATCCCATGGAAGA 
GAGAAGATTCTTATCGCAACCAAATCTCTCTGGAAATTGCCATCAAACTCCCACTGT^AGA
CACAAAAGAAGAAGAAAAGAAGAAGCTCTTCTCAATCTTGTAGGTACTGTCACTTGCTCT 
CCAGCCC

>G1852 Amino Acid Sequence (domain in AA coordinates: 1-601) 

MGHVNLPASKRGNPRQWRLLDIVTAAFFGIVLLFFILLFTPLGDSMAASGRQTLLLSTAS

DPRQRQRLVTLVEAGQHLQPIEYCPAEAVAHMPCEDPRRNSQLSREMNFYRERHCPLPEE

TPLCLIPPPSGYKIPVPWPESLHKIWHANMPYNKIADRKGHQGWMKREGEYFTFPGGGTM

FPGGAGQYIEKLAQYIPLNGGTLRTALDMGCGVASFGGTLLSQGILALSFAPRDSHKSQI

QFALERGVPAFVAMLGTRRLPFPAYSFDLMHCSRCLIPFTAYNATYFIEVDRLLRPGGYL

VISGPPVQWPKQDKEWADLQAVARALCYELIAVDGNTVIWKKPVGDSCLPSQNEFGLELC

DESVPPSDAWYFKLKRCVTRPSSVKGEHALGTISKWPERLTKVPSRAIVMKNGLDVFEAD

ARRWARRVAYYRDSLNLKLKSPTVRNVMDMNAFFGGFAATLASDPVWVMNVIPARKPLTL

DVIYDRGLIGVYHDWCEPFSTYPRTYDFIHVSGIESLIKRQDSSKSRCSLVDLMVEMDRI

LRPEGKWIRDSPEVIjDKVARMAHAWWSSS 

* 

>G1793 (59..1783)

AGTGATTTATTGATTAACCCAAACACAAAATAAACAGATTTGACTCAAAAAGAAGAAAAT 
GAATTCTAACAACTGGCTTGGCTTTCCTCTTTCACCGAACAACTCTTCTTTGCCTCCTCA 
TGAATACAACCTTGGCTTGGTCAGCGACCATATGGACAACCCTTTTCAAACACAAGAGTG 
GAATATGATCAATCCACACGGTGGAGGAGGAGATGAAGGAGGAGAGGTTCCAAAAGTGGC 
CGATTTTCTCGGTGTGAGCAAACCGGACGAAAACCAATCCAACCACCTAGTAGCTTACAA 
CGACTCAGACTACTACTTCCATACCAATAGCTTGATGCCTAGCGTCCAATCAAACGATGT 
CGTTGTAGCAGCTTGTGACTCCAATACTCCTAACAACAGTAGCTATCATGAGCTTCAAGA 
GAGTGCTCACAATCTACAGTCACTTACTTTGTCCATGGGGACCACCGCTGGTAATAATGT 
TGTAGACAAAGCTTCACCATCCGAGACCACCGGGGATAACGCTAGCGGTGGAGCACTAGC 
CGTTGTTGAGACGGCCACGCCAAGACGTGCATTGGACACTTTCGGACAACGAACCTCGAT 
CTATCGTGGTGTCACAAGACATCGATGGACTGGTCGATATGAGGCTCATCTATGGGATAA 
TAGTTGTAGAAGGGAAGGCCAGTCTAGGAAAGGAAGACAAGTTTACTTGGGTGGATATGA 
CAAAGAAGATAAAGCAGCAAGATCATATGATCTAGCTGCACTTAAGTACTGGGGTCCTTC 
AACTACTACTAATTTCCCCATTACAAACTACGAGAAAGAAGTAGAGGAAATGAAGCACAT 
GACGAGACAAGAGTTCGTGGCTGCCATTAGAAGGAAAAGTAGTGGATTTTCGAGAGGCGC 
TTCGATGTATGGAGGAGTTACAAGGCATCACCAACATGGAAGATGGCAAGCAAGGATCGG 
CCGAGTCGCCGGAAACAAAGACCTCTACTTGGGAACTTTTAGCACTGAGGAAGAAGCAGC 
AGAAGCTTACGATATAGCTGCAATAAAGTTTAGAGGACTTAATGCAGTGACCAACTTCGA 
GATCAACCGGTACGACGTGAAAGCCATTCTAGAGAGTAGCACTCTTCCCATCGGAGGAGG 
CGCAGCTAAACGGCTCAAAGAAGCTCAAGCTCTTGAGTCTTCAAGGAAACGCGAGGCGGA 
GATGATAGCCCTTGGTTCAAGTTTCCAGTACGGTGGTGGCTCGAGCACAGGCTCTGGCTC 
CACCTCATCAAGACTTCAGCTTGAACCTTACCCTCTAAGCATTCAACAACCATTAGAGCC 
TTTTCTATCTCTTCAGAACAATGACATCTCTCATTACAACAACAACAATGCTCACGATTC 
CTCCTCTTTTAATCACCATAGCTATATCCAGACACAACTTCATCTCCACCAACAGACCAA 
CAATTACTTGCAGCAACAGTCGAGCCAGAACTCTCAGCAGCTCTACAATGCGTATCTTCA 
TAGCAATCCGGCTCTGCTTCATGGACTTGTCTCTACCTCTATCGTTGACAACAATAATAA 
CAATGGAGGCTCTAGTGGGAGCTACAACACTGCAGCATTTCTTGGGAACCACGGTATTGG 
TATTGGGTCCAGCTCGACTGTTGGATCGACCGAGGAGTTTCCAACCGTTAAAACAGATTA 
CGATATGCCTTCCAGTGATGGAACCGGAGGGTATAGTGGTTGGACCAGTGAGTCTGTTCA 
GGGGTCAAACCCTGGTGGTGTTTTCACTATGTGGAATGAGTAAACAAGGATCTCTTTCTT 
GCGGCACAAGGAATGGGT 

>G1793 Amino Acid Sequence (conserved domain in AA coordinates: 179-255, 281-349)
MNSNNWLGFPLSPl^SSLPPHEYNLGL^ 

ADFLGVSKPDENQSimLVAYiroSDYYFHTNSLMPSVQSNDWVAACDSNTPNN 
ESAHmQSLTLSMGTTAGNTS^ 

IYRGVTRHRWTGRYEAHLWDNSCRREGQSRKGRQVYL.GGYDKEDKAARSYDLAAB 

STTTNFPITNYEKEVBEMKHMTRQEFVAAIRRKSSGFSRGASMYRGV 

GRVAGNKDLYLGTFSTEEEAAEAYDIAAIKFRGLNAVT^EINRYDVKAILESSTLPIGG 

GAAKRLKEAQALESSRKREAEMIALGSSFQYGGGSSTGSGSTSSRLQLQPYPLSIQQPLE 

P FLSLQNND I SH YNlNfNNAHD S S S FNHHS YI QTQLHLHQQTNNYIjQQQS S QNS QQL YNAYL 

HSNPALLHGLVSTSIVDNICTSn^ 
YDMPSSDGTGGYSGWTSESVQGSNPGGVFTMWNE*
>G761 (521..1549)

GGGGCCGACCGGCCGCCCGGGCAGGTCTAGGTTCAAAAGGACTCACAAGAGAGAGATAGT 

ATGATTGATAGGGAAAGAGAGAGAGATGAAAGAAAGTAAAATATATAATAGATTATTAGG 

ACACGAGTGTCATCTTTTGATTTGTGTCTTGTGTGCTCTCTCTTTCTTCTCTTCCTCGAA 

TGATCATCTTTATATAACCCTACTCTCTTTCTCTTTTCCCATTCTTTCATATCATTCTCC 

CTTTCTCTCTCGGGATCTGATCTCTCTTTCCAGTAACCTATTCCCGAGGAGCACTGTCAA 

ATCTTGTCCACTCTTTGATCTTATCTCGATCTCTTTCTCTTTCTAGTCTTGTGTAGTCTT 

CAAACTTGTGATGTTATCTATATAGTAATCACGAGAGAGAATCATACAATAGCTGAAACA 

TAAAGCTTTCTTAGAAGCTTTAAAAAGGTCTCATCTGGATTATCCTGTTTAATTTCTAGA 

GTTTCTTCAGGCAGATTATTAACCGATCAAGAAGACAAACATGAATTCATTTTCCCACG 

CCCTCCGGGTTTTAGATTTCACCCGACAGATGAAGAACTTGTAGACTACTACCTGAGGAA 

AAAAGTCGCATCGAAGAGAATAGAAATTGATTTCATAAAGGACATTGATCTTTAC^ 

TGAGCCATGGGACCTTCAAGAGTTGTGCAAAATTGGGCATGAAGAGCAGAGTGATTGGTA 

CTTCTTTAGCCATAAAGACAAGAAGTATCCCACAGGGACTCGAACCAATAGAGCAACAAA 

AGCAGGGTTTTGGAAAGCCACCGGAAGAGATAAGGCTATCTATTTGAGGCATAGTCTAAT 

TGGCATGAGGAAAACACTTGTGTTTTACAAGGGAAGAGCCCCAAATGGACAAAAGTCTGA 

TTGGATCATGCACGAATACCGCTTAGAAACCGATGAAAACGGAACTCCTCAGGAAGAAGG 

ATGGGTTGTGTGTAGGGTTTTCAAGAAGAGATTGGCTGCAGTTAGACGAATGGGAGATTA 

CGACTCATCCCCTTCACATTGGTACGATGATCAACTTTCTTTTATGGCCTCCGAGCTCGA 

GACAAACGGTCAACGACGGATTCTCCCCAATCATCATCAGCAGCAGCAGCACGAGCACCA 

ACAACATATGCCATATGGCCTCAATGCATCTGCTTACGCTCTCAACAACCCTAACTTGCA 

ATGCAAGCAAGAGCTAGAACTACACTACAACCACCTGCAATCAAATATCGCGCATGAGGA 

ACAATTGAATCAAGGAAATCAGAACTTCAGCTCTCTATACATGAACAGCGGCAACGAGCA 

AGTGATGGACCAAGTCACAGACTGGAGAGTTCTCGATAAATTTGTTGCTTCTCAGCTAAG 

CAACGAGGAGGCTGCCACAGCTTCTGGATCTATACAGAATAATGCCAAGGACACAAGCAA 

TGCTGAGTACCAAGTTGATGAAGAAAAAGATCCGAAAAGGGCTTCAGACATGGGAGAAGA 

ATATACTGCTTCTACTTCTTCGAGTTGTCAGATTGATCTATGGAAGTGAGCTGAAAGAGA 

AGACATATAAATGCATATATACATATATATATATACGTACACACGAACACTAATCAAGTG 

TAGATGATGATGATGGTACAGATTTATATTTGCTTTGATTGATTCTTACTACATTATTGA 

ACTTATGTCATATGCATATATACATTGCGTATCTATGCATATTTATACTTGTACTCAATA 

TGATTAACCATATATAAACTCTAATCTAAATGTAACTCCAATATTTTTTAAATAGACAAT 

TGTCTCTTCTTATTAGAAAAAAAA 

>G761 Amino Acid Sequence (domain in AA coordinates: 10-156) 

MNSFSHVPPGFRFHPTDEELVDYYLR^^ 

EEQSDWYFFSHKDKICYPTGTRTNRATKAGFW^ 

PNGQKSDWIMHEYRLETDENGTPQEEGWVVCRVFKKRLAAVRRMGDYDSSPSHW 

Fl^SELETNGQRRILPOTmQQQQHEHQQHMPYGLNASAYALmTPNLQCKQELELHYNH 

SNIAHEEQLNQGNQNFSSLYIWSGNEQVMDQVTO^ 

NAKDTSNAEYQVDEEKDPKRASDMGEEYTASTSSSCQIDLWK* 

>G1056 (10..798)

GCTACATATATGGGTTCTATTAGAGGAAACATTGAAGAGCCTATATCTCAGTCATTAACG 
AGGCAGAACTCTCTCTATAGCTTAAAGCTCCATGAGGTTCAAACCCACTTAGGAAGTTCT 
GGAAAACCACTAGGAAGCATGAACCTTGATGAGCTTCTCAAGACTGTCTTGCCACCAGCT 
GAGGAAGGGCTTGTTCGTCAGGGAAGCTTGACGTTACCTCGAGATCTCAGTAAAAAGACA 
GTTGATGAGGTCTGGAGAGATATCCAACAGGACAAGAATGGAAACGGTACTAGTACTACT 
ACTACTCATAAGCAGCCTACACTCGGTGAAATAACACTTGAGGATTTGTTGTTGAGAGCT 
GGTGTAGTGACTGAGACAGTAGTCCCTCAAGAAAATGTTGTTAACATAGCTTCAAATGGG 
CAATGGGTTGAGTATCATCATCAGCCTCAACAACAACAAGGGTTTATGACATATCCGGTT 
TGCGAGATGCAAGATATGGTGATGATGGGTGGATTATCGGATACACCACAAGCGCCTGGG 
AGGAAAAGAGTAGCTGGAGAGATTGTGGAGAAGACTGTTGAGAGGAGACAGAAGAGGATG 
ATCAAGAACAGAGAATCTGCAGCACGTTCACGAGCTAGGAAACAGGCTTATACACATGAA 
TTAGAGATCAAGGTTTCAAGGTTAGAAGAAGAAAACGAAAAACTTCGGAGGCTAAAGGAG 
GTGGAGAAGATCCTACCAAGTGAACCACCACCAGATCCTAAGTGGAAGCTCCGGCGAACA 
AACTCTGCTTCTCTCTGATCCTAAAGACTCTTCTTTCTTTCTTCTTCTTTGTGTTGGTTT 
ATATCAGACCGCTTTGTTCTTTGTATATTGTGTAGACTTTATTGACTTTGAACAGCATGT 
CTTTATAAACATTTCTTGAGTGT 
>G1056 Amino Acid Sequence (domain in AA coordinates: 183-246)
MGSIRGNIEEPISQSLTRQNSLYSLKLHEVQTHLGSSGKPLGSMNLDELLKTVLPPAEEG
LVRQGSLTLPRDLSKKTVDEVWRDIQQDKNGNGTSTTTTHKQPTLGEITLEDLLLRAGVV
TETVVPQENVVNIASNGQWVEYHHQPQQQQGFMTYPVCEMQDMVMMGGLSDTPQAPGRKR

VAGEIVEKTVERRQKRMIKNRESAARSRARKQAYTHELEIKVSRLEEENEKLRRLKEVEK 

ILPSEPPPDPKWKLRRTNSASL* 

>G1447 (82..1086)

AAAAACCCTAACCCTAATTCTCTCAAGACAACTCAAAGGTCTCTCCTTTTTTAGGTTTAT 
TATCACTTCCGTATAATCGCCATGTCTTCTCTACCATGGAAAAAACCAAAATCGAGTCGA 
ATCTTAAGATTCATTTCTGAGTTTCAACAATCACCGTTCGTTGAAACTGGCTTTCCAACT 
TCTCTGATCGATCTCTTCTTCAAGAATCGCGATCGTCTAAAAAAATCTCCATCTAAACGC 
TTCCAACGAATCGAACGCCAGATTCGAACCGCTCCAAACGCTTCTTCGTTGAGTAATCAA 
GATACGATTTTTGAAAAGCCCTCGAGGATTAAAACCGTTCGAAGTAAGGTCGAGAAAGTT 
AATTGCGTTAAAGGTAAATCAGCGGCGTTGAAGAAGAACGCGATTAAAAATAGCGTTTTC 
GGCGGTAGCGGTGAGGTCGTTTTGATGGCGTTTAAGGTTTTGATAGTAGCGTTGCTCGCC 
TTGAGCACGAAGAAGAAGCTCACTTTAGGAATCACTCTCTCTGCCTTCGCTCTTCTCTTA 
ACAGAGCTCGTGGCGGCGCGTGTTTTCACGCGCTCTAATAACACCGACAAAGACAAAAAC 
GCGATTGCCCGCGAGAAAATCGAAACTTTTGATGAAACTCGAGTTCCCAAAGCGATTCCA 
TGTCCTGAGGAAACAGAGCATGTAGTATCTGAAACAGAGGTTTCGAAGTTGAAAGGTTTA 
ACGATACGTGATCTGTTGTCAAAGGACGAGAAATCAACAAGTAAAAGTTGGAGACTAAAA 
TCGAAGATTGTGAAGAAGTTGAGGAGTTACAATAAGAAGGATAAGAAGACGATGAAGATC 
AAAGAAGAGTCTTTGATTGAAGTCTCGAGTTTGGTTTTAGAAGATAAACCAAAGAAAATT 
GAGTCTGAGAGAGACGAAGAAGAAACGTTGAATCCTCCAGTGGTTGGATCAAACCTGAAT 
GGGATTGTTCTGATCGTGATTGTGCTAACCGGTTTGTTATGTGGGAAGGTCTTAGCTATT 
GTTCTGACACTATCATGTTTGGTTCTTAGATTAGGAGCAGTCAAAAAAGTTAATCTTTGC 
ATATAATTTTTTTTGTATTTTTTAACATGCTTGCATGTGAAACTGTA^ATTTTTCTCATT 
CATATGAAGGAGATTGGATTGAATGTTGAATACTAAA 

>G1447 Amino Acid Sequence (domain in AA coordinates: 3-54, 124-156)
MSSLPWKKPKSSRILRFISEFQQSPFVETGFPTSLIDLFFKNRDRLKKSPSKRFQRIERQ
IRTAPNASSLSNQDTIFEKPSRIKTVRSKVEKVNCVKGKSAALKKNAIKNSVFGGSGEVV
LMAFKVLIVALLALSTKKKLTLGITLSAFALLLTELVAARVFTRSNNTDKDKNAIAREKI
ETFDETRVPKAIPCPEETEHVVSETEVSKLKGLTIRDLLSKDEKSTSKSWRLKSKIVKKL
RSYNKKDKKTMKIKEESLIEVSSLVLEDKPKKIESERDEEETLNPPVVGSNLNGIVLIVI
VLTGLLCGKVLAIVLTLSCLVLRLGAVKKVNLCI*
>G323 (77..826)

CTGCTCATATCAGCCATTGACACAGTTGCTTTGGGTTTCCCTCAAACGGCGCCGATTGTG 
TGGATTTTGACCACTGATGGCCTTAGATCAATCTTTTGAAGATGCTGCTTTACTTGGAGA 
ACTCTATGGAGAAGGTGCATTTTGTTTCAAGAGCAAGAAACCTGAACCCATTACAGTCTC 
GGTTCCTTCTGATGATACTGATGATTCGAATTTTGACTGCAATATTTGCTTAGACTCGGT 
GGAAGAACCTGTTGTGACTCTCTGTGGTCACCTCTTTTGCTGGCCTTGTATTCACAAATG 
GCTTGATGTACAGAGCTTCTCAACAAGTGATGAATACCAAAGACATAGACAGTGTCCTGT 
TTGTAAATCTAAAGTTTCTCATTCTACTTTGGTTCCTTTGTATGGTAGAGGCCGTTGTAC 
TACTCAGGAGGAAGGTAAi\AACAGTGTGCCTAAAAGACCCGTAGGACCGGTTTATCGGCT 
TGAAATGCCGAATTCACCTTATGCAAGTACTGATCTGCGGTTATCACAACGGGTTCATTT 
CAATAGCCCACAGGAAGGTTACTACCCTGTCTCAGGGGTGATGAGCTCGAACAGTTTATC 
ATACTCTGCTGTTTTGGATCCGGTGATGGTGATGGTTGGAGAAATGGTAGCTACGAGGTT 
GTTTGGAACACGAGTGATGGATAGATTTGCGTATCCGGACACTTACAATCTCGCAGGQAC 
TAGCGGGCCGAGGATGAGAAGGCGGATAATGCAGGCAGATAAATCGCTGGGAAGAATCTT 
CTTCTTCTTTATGTGTTGTGTTGTTCTGTGTCTTCTCTTGTTTTAGGTTTTCATAGCTAG 
CTTGGTTCTGCTACTGTTCAGTTTCTTCAGG 

>G323 Amino Acid Sequence (conserved domain in AA coordinates: 48-96)

MALDQSFEDAALLGELYGEGAFCFKSKKPEPITVSVPSDDTDDSNFDCNICLDSVQEPVV

TLCGHLFCWPCIHKWLDVQSFSTSDEYQRHRQCPVCKSKVSHSTLVPLYGRGRCTTQEEG

KNSVPKRPVGPVYRLEMPNSPYASTDLRLSQRVHFNSPQEGYYPVSGVMSSNSLSYSAVL

DPVMVMVGEMVATRLFGTRVMDRFAYPDTYNLAGTSGPRMRRRIMQADKSLGRIFFFFMC

CVVLCLLLF*

>G176 (41..1606)
AG7UVGAAGAAGAAGAAGAGTACCTCATACGTAAACCATTGATGGGCTCTTTTGATCGCCA
AAGAGCTGTTCCGAAATTCAAAACAGCAACACCGTCACCGCTCCCTCTTTCTCCTTCGCC 
TTACTTCACTATGCCTCCTGGCCTTACTCCCGCCGACTTTCTCGACTCTCCTCTTCTCTT 
CACTTCCTCCAACATTTTGCCGTCTCCTACGACAGGCACATTTCCAGCGCAATCTCTGAA 
CTATAACAATAACGGTTTGCTCATTGACAAAAATGAAATCAAATATGAAGACACAACTCC 
TCCCTTGTTCCTACCATCTATGGTAACTCAGCCTTTACCTCAACTGGATTTATTCAAATC 
CGAAATCATGTCGAGTAACAAAACCTCTGATGACGGCTACAATTGGCGCAAATACGGGCA 
GAAGCAAGTCAAAGGAAGCGAAAACCCGAGGAGTTACTTCAAATGCACGTATCCAAATTG 
TCTCACAAAGAAGAAAGTAGAGACGTCTCTTGTGAAGGGTCAGATGATTGAGATTGTCTA 
TAAAGGAAGCCACAATCATCCCAAGCCCCAATCCACGAAGCGATCATCTTCCACCGCTAT 
AGCAGCACATCAGAACAGCAGTAATGGAGACGGTAAAGACATTGGTGAAGATGAAACAGA 
GGCCAAGAGATGGAAAAGAGAAGAGAATGTGAAGGAGCCAAGAGTGGTGGTTCAGACAAC 
AAGTGATATAGACATTCTTGACGATGGCTACAGATGGAGAAAGTATGGTCAGAAAGTCGT 
CAAGGGTAATCCAAATCCAAGGAGCTATTACAAGTGCACATTTACAGGATGTTTTGTAAG 
GAAACACGTTGAAAGAGCATTTCAAGATCCCAAGTCAGTGATCACAACTTACGAAGGAAA 
ACACAAACACCAAATCCCGACCCCAAGAAGAGGTCCAGTTTTAAGATCTGCTGCAATGGC 
TTCTCCTCTTCTCCCAACTTCGACTACTCCTGATCAACTTCCCGGCGGCGATCCACAGTT 
GCTGAGCTCTCTACGCGTCCTCTTGTCCCGCGTTCTAGCCACCGTCCGTCACGCTTCTGC 
AGATGCCAGACCCTGGGCAGAGCTCGTTGACCGGTCAGCGTTTTCCCGGCCACCATCGCT 
CTCGGAGGCAACGTCACGAGTAAGGAAGAACTTTTCCTATTTCCGAGCCAATTACATAAC 
CTTAGTGGCAATCTTACTCGCCGCGTCTCTGCTCACGCACCCTTTCGCTCTCTTCCTCCT 
CGCATCGCTGGCCGCTTCTTGGCTTTTCCTCTACTTTTTCCGTCCGGCGGATCAGCCGTT 
GGTCATTGGAGGACGCACGTTCTCCGATCTTGAGACGCTAGGGATACTCTGCCTGTCCAC 
TGTGGTGGTGATGTTCATGACCAGCGTTGGATCGCTCTTGATGTCCACTCTAGCAGTTGG 
GATCATGGGCGTGGCCATCCACGGAGCGTTTCGTGCTCCCGAAGACCTGTTTCTTGAAGA 
ACAAGAAGCCATTGGATCTGGACTTTTCGCATTCTTCAACAACAATGCCTCTAATGCAGC 
TGCCGCTGCCATAGCCACCTCAGCAATGTCACGCGTTCGAGTCTGAGATTGTTGAAGAGA 
CTACATTCCTACACCGCATTTCCAAAGTGTGATATTTATTCATATTGAATTGTT 

>G176 Amino Acid Sequence (domain in AA coordinates: 117-173, 234-290)

MGSFDRQRAVPKFKTATPSPLPLSPSPYFTMPPGLTPADFLDSPLLFTSSNILPSPTTGT 

FPAQSLimvINNGL^ 

NWRKYGQKQVKGSENPRSYFKCTYPNCLTKKKVETSLVKGQMIEIVYKGSHNHPKPQSTK 
RS S STAI AAHQNS SNGDGKD I GEDETEAKRWKREENVKEPRWVQTTSD ID I LDDGYRWR 
KYGQKVVKGNPNPRSYYKCTFTGCFVRKEfTERAFQDPKSVITTYEGKHKHQIPTPRRGPV 
LRSAAI^SPLLPTSTTPDQLPGGDPQLLSSLRVLLSRVI^^ 
FSRPPSLSEATSRVRICNFSYFRAlTCITLVAIIiIiAASLLTHPF^ 

RPADQPLVIGGRTFSDLETLGILCLSTVVVMFMTSVGSLLMSTIiAVGIMGVAIHGAFRAP 

EDLFLEEQEAIGSGLFAFFNNNASNAAAAAIATSAMSRVRV* 

>G174 (194..1585)

CCCAATTTGAGATTGTTCGATTTCGATCTACGAGATTCTTACAAGAACATAAGCAGCTTC 
GGTTTTTTGGGATTATCTTATTTGGTCGGATGATGATCTTCTCGATGTCTGTGCTAGGCT 
TTGGGAATTAGATATATTTGGGGTTAAGCTCGAGTCTCTCCGGTTTTGAGTTTACTTGAG 
TTTGTTAGTATTTATGGCTGAGGTGGGAAAAGTTCTGGCTAGTGATATGGAGTTAGACCA 
TTCAAATGAGACTAAAGCAGTGGATGATGTTGTTGCCACTACTGATAAAGCGGAGGTCAT 
ACCAGTGGCTGTAACTAGAACTGAAACCGTTGTTGAAAGTTTGGAATCTACTGACTGTAA 
GGAGCTTGAAAAACTTGTTCCACATACGGTAGCTTCGCAGTCGGAAGTAGATGTTGCTTC 
CCCGGTATCCGAGAAAGCAGCGAAGGTTTCTGAAAGTAGCGGTGCATTATCTTTGCAGTC 
TGGTTCGGAAGGGAATAGTCCTTTTATTCGTGAGAAGGTTATGGAAGACGGATACAACTG 
GCGGAAATATGGACAGAAACTTGTGAAAGGAAATGAGTTTGTAAGGAGCTATTACAGGTG 
CACTCACCCTAACTGCAAAGCGAAAAAACAGTTGGAACGGTCTGCGGGTGGACAAGTCGT 
GGATACCGTTTACTTTGGGGAACATGATCACCC7yVAGCCTCTTGCTGGTGCTGTTCCTAT 
CAATCAGGATAAGCGAAGTGATGTCTTCACAGCTGTTAGTAAAGAGAAAACATCTGGATC 
CAGTGTTCAGACACTTCGTCAAACCGAACCACCAAAGATCCATGGAGGATTACATGTTTC 
AGTTATTCCACCAGCTGATGATGTGAAAACTGATATTTCACAATCAAGTAGGATAACGGG 
GGACAACACTCACAAGGATTATAATAGTCCTACCGCAAAGCGAAGGAAGAAAGGAGGGAA 
CATTGAGCTGAGTCCAGTGGAGAGGTCAACCAATGATTCACGCATTGTGGTTCACACTCA 
GACTCTGTTTGATATTGTGAATGATGGGTACCGATGGCGTAAATATGGTCAGAAATCAGT 
AAAAGGCAGCCCATATCCAAGGAGCTACTATAGATGTTCAAGCCCTGGATGCCCCGTCAA
GAAACACGTAGAGAGGTCATCTCATGACACAAAGTTGCTTATAACAACTTACGAGGGAT^A 
ACACGACCACGATATGCCTCCAGGAAGAGTTGTTACTCATAATAACATGCTGGACTCGGA 
AGTTGATGATAAAGAAGGAGATGCCAACAAGACTCCACAGAGCTCAACTCTTCAATCCAT 
TACAT^AAGACCAGCATGTCGAAGATCACTTAAGAAAGAAAACGAAGACTAATGGCTTTGA 
GAAAAGTCTTGATCAAGGTCCAGTTTTGGATGAGAAGCTGAAGGAGGAAATAAAAGAGAG 
ATCAGATGCAAACAAAGATCACGCAGCCAATCACGCCAAGCCGGAAGCAAAGTCAGATGA 
TAAAACCACTGTTTGTCAAGAGAAGGCAGTAGGAACCCTGGAGAGCGAGGAACAAAAACC 
CAAGACAGAGCCTGCCCAAAGCTAAGCATTCAGTGTTGTACCGAGTGGTAATTTATATGG 
CTGTTTTAACATAGATTAGTACAGGCGATATGGTTATAGACTGTACAGTTGTTGTTCAGG 
CGGGACCAGATTTAGATTAGTGTTTAATGGAATAGTATGCTTTAATACCTTTATGTAACC 
ACTTCCATTTGGTTCAAATAAGAGTTACAGGAAGAGAAGGTAACACAACAAGAGCCCTTC 
TTTGTTGATGGAGCCTGTGTAATAGTTGTAGCATGGGGATGTATATGATTTGATTCAACC 
TTATTAATGGTTATGAGACAAAACTATC 

>G174 Amino Acid Sequence (domain in AA coordinates: TBD) 

MAEVGKVIiASDMELDHSNETKAVDDWATTO 

LVPHTVASQSEVDVASPVSEKAPKVSESSGALSLQSGSEG 

QKIiVKGNEFVRSYYRCTHPNCKAKKQLERSAGGQVVDTVYFGEHDHPKPIjAGAVPINQDK 
RSDVFTAVSKEKTSGSSVQTLRQTEPPKIHGGLHVSVIPPADDVKTDISQSSRITGDNTH 
KDYNSPTAKRRKKGGNIELSPVERSTNDSRIVVHTQT^ 
YPRSYYRCSSPGCPVKIOIVERSSHDTKIjLITTY^ 

EGDANKTPQSSTLQS ITKDQHVEDHLRKKTKTNGFEKSLDQGPVLDEKLKBE I KERSDAN 
KDHAANHAKPEAKSDDKTTVCQEKAVGTLESEEQKPKTEPAQS * 
>G715 (1..705) 

ATGGATACCAACAACCAGCAACCACCTCCCTCCGCCGCCGGAATCCCTCCTCCACCACCT 

GGAACCACCATCTCCGCCGCAGGAGGAGGAGCTTCTTACCACCACCTTCTCCAACAACAA 

CAACAACAGCTCCAACTATTCTGGACCTACCAACGCCAAGAGATCGAACAAGTTAACGAT 

TTCAAAAACCATCAGCTTCCACTAGCTAGGATAAAAAAGATCATGAAAGCCGATGAAGAT 

GTTCGTATGATCTCCGCAGAAGCACCGATTCTCTTCGCGAAAGCTTGTGAGCTTTTCATT 

CTCGAGCTCACGATCAGATCTTGGCTTCACGCTGAGGAGAATAAACGTCGTACGCTTCAG 

AAAAACGATATCGCTGCTGCGATTACTAGGACTGATATCTTCGATTTCCTTGTTGATATT 

GTTCCTAGAGATGAGATTAAGGACGAAGCCGCAGTCCTCGGTGGTGGAATGGTGGTGGCT 

CCTACCGCGAGCGGCGTGCCTTACTATTATCCGCCGATGGGACAACCAGCTGGTCCTGGA 

GGGATGATGATTGGGAGACCAGCTATGGATCCGAATGGTGTTTATGTCCAGCCTCCGTCT 

CAGGCGTGGCAGAGTGTTTGGCAGACTTCGACGGGGACGGGAGATGATGTCTCTTATGGT 

AGTGGTGGAAGTTCCGGTCAAGGGAATCTCGACGGCCAAGGGTAA 

>G715 Amino Acid Sequence (domain in AA coordinates: 60-132) 

MDTNNQQPPPSAAGIPPPPPGTTISAAGGGASYHHLLQQQQQQLQLFWTYQRQEIEQVND

FKNHQLPLARIKKIMKADEDVRMISAEAPILFAKACELFILELTIRSWLHAEENKRRTLQ

KNDIAAAITRTDIFDFLVDIVPRDEIKDEAAVLGGGMVVAPTASGVPYYYPPMGQPAGPG

GMMIGRPAMDPNGVYVQPPSQAWQSVWQTSTGTGDDVSYGSGGSSGQGNLDGQG*

>G588 (196..1599)

ATCTGAAGTGAACCAAGCTCAGGTTTTGTCTTCTCTTTGATCATTCCTTTCTCAGCAATA 
TAAATTAGAGTTATATCCTTTATAAAGGATTTTGCTTTTTCACCAACAAACCCTAAATTC 
GGTGTCTCAGCAAGAATCACGTGATTCTCGTTCCTCTTCCTCACGAAACCCATCATCTTC 
TATCTCATTTGAGAAATGGGTCAAAAGTTTTGGGAGAATCAAGAAGATCGAGCGATGGTT 
GAATCCACCATAGGCTCTGAAGCTTGCGACTTTTTCATCTCAACAGCTTCAGCTTCCAAC 
ACTGCCTTGTCCAAGCTTGTCTCACCACCAAGTGATTCCAATCTCCAACAAGGGTTACGT 
CACGTTGTTGAAGGATCTGATTGGGATTATGCTCTTTTCTGGCTAGCGTCCAACGTTAAT 
AGCTCTGATGGTTGTGTCTTGATCTGGGGAGATGGTCATTGCCGTGTCAAAAAGGGTGCT 
TCAGGTGAGGATTACTCTCAGCAAGATGAGATCAAAAGACGTGTGCTTCGCAAGCTTCAC 
TTGTCGTTCGTTGGTTCAGATGAAGATCATCGTTTGGTGAAATCAGGAGCTCTTACTGAT 
CTCGACATGTTTTATCTGGCTTCTTTGTACTTTTCCTTTAGGTGTGATACCAATAAGTAC 
GGTCCTGCTGGAACCTATGTGTCTGGGAAGCCTCTTTGGGCTGCAGATTTGCCTAGCTGC 
TTGAGTTATTATAGGGTTAGGTCTTTCTTAGCTAGGTCAGCTGGTTTTCAGACTGTGTTG 
TCTGTACCAGTGAATTCTGGAGTTGTGGAGCTTGGTTCTTTAAGACATATTCCAGAAGAT
AAGAGTGTGATTGAGATGGTGAAATCAGTGTTTGGTGGGTCTGACTTTGTTCAGGCTAAA
GAAGCTCCTAAAATCTTTGGTCGACAGCTGAGTCTTGGTGGAGCAAAACCTCGGTCTATG 
AGTATTAATTTCTCCCCGAAGACCGAGGATGACACGGGTTTCTCATTGGAATCGTATGAG 
GTGCAAGCGATCGGAGGCTCTAATCAAGTGTATGGTTATGAGCAAGGGAAAGATGAGACA 
TTGTATCTAACTGACGAGCAAAAGCCGAGGAAGAGAGGGAGAAAACCAGCAAATGGAAGA 
GAAGAGGCTCTAAACCATGTGGAAGCGGAACGGCAGAGGAGGGAGAAGCTGAACCAGAGA 
TTCTACGCTTTGAGAGCGGTGGTGCCTAACATCTCCAAGATGGACAAGGCTTCGCTCCTT 
GCAGACGCAATCACTTACATCACGGATATGCAGAAGAAAATCAGGGTGTATGAAACAGAG 
AAGCAGATAATGAAGAGGAGGGAGAGTAATCAGATAACTCCAGCAGAGGTTGATTATCAA 
CAGAGGCATGATGATGCAGTTGTAAGGCTAAGCTGTCCGTTGGAAACTCATCCAGTTTCA 
AAGGTGATACAAACGTTGAGGGAGAATGAAGTTATGCCTCATGATTCCAACGTGGCCATC 
ACAGAGGAGGGTGTGGTTCACACATTCACTCTCCGGCCTCAGGGTGGCTGCACCGCTGAG 
CAGTTGAAGGACAAGCTCCTTGCCTCTCTATCACAGTAACTATCACAGCAGTAACTGCTA 
TGTAATAAGTGTAACCGTGTTGGAGGTTGTATCAATGTACTATTGCAAGCCAACCAAAAA 
AAACTCCAGCTTAGTAGGATCGTGTAATTTTCCTTATATGTAATGTTGAGATTTGTCTTT 
TACATATAAAGATTTGA 

>G588 Amino Acid Sequence (domain in AA coordinates: 309-376) 

MGQK^WENQEDRAMVESTlGSEACDFFISTASASNTALSKLVSPPSDSNIiQQGLRHVVEG 

SDWDYAIiFWLASlsrVNSSDGC\njIWGDGHCR 

SDEDHRIjVKSGALTDLDMFYLASLYFSFRCDTNKyGPAGTYVSGKPLWAADLPSCIjS 

VRSFI»ARSAGFQTVLSVPWSGWELGSLRHIPEDKSVIEMVKSVFGGSDFVQAKEAPKI 

FGRQLSLGGAKPRSMSINFSPKTEDDTGFSLESYEVQAIGGSNQVYGYEQGKDETLYIiTD 

EQKPRKRGRKPANGREEALNHVEAERQRREKLNQRF YALRAVVPNI S KMDKASLLADAI T 

YITDMQKKIRVYETEKQIMKRRESNQITPAEVDYQQRHDDAVVRLSCPLETHPVSKVIQT 

LRENEVMPHDSNVAITEEGVVHTFTLRPQGGCTAEQIiKDKLLASLSQ* 

>G1758 (69..677)

GTCCCTCCTCTTAGCTTCAACCGCCGGAAAAACTAAACAACCTTCTTGGAAAAAAAGAGA 

AACTAAAAATGAACTATCCTTCAAACCCTAACCCTAGCTCCACAGATTTCACTGAATTTT 

TCAAGTTCGATGATTTTGACGATACTTTTGAGAAGATCATGGAAGAAATCGGCCGTGAGG 

ACCACTCGTCGTCACCGACTTTGAGTTGGAGTTCATCGGAAAAGTTAGTGGCTGCAGAAA 

TCACAAGCCCGCTTCAAACAAGCCTAGCTACCTCACCTATGAGCTTTGAAATAGGTGACA 

AAGATGAAATCAAAAAGAGGAAGAGACACAAAGAAGATCCGATTATTCACGTCTTCAAAA 

CGAAATCATCAATTGATGAAAAGGTTGCTTTAGATGATGGGTATAAATGGAGGAAATACG 

GAAAGAAGCCGATAACGGGTAGTCCATTTCCAAGGCATTATCACAAGTGTTCGAGCCCAG 

ATTGCAACGTGAAGAAGAAGATCGAAAGAGATACGAACAATCCAGATTACATATTGACAA 

CATACGAAGGTAGACATAACCACCCAAGCCCTTCTGTAGTTTATTGTGATTCAGACGACT 

TTGATCTTAACTCTCTCAACAATTGGTCCTTTCAGACGGCAAATACGTATAGTTTCTCTC 

ATTCTGCTCCATATTGATCGATCGTAGTTACAAGTTTGTGTATATAGATGTATATATATA 

TATCACCAATTCACCATCGTAATCACGTCTCACATGTAACTACGTACATATATCTTGTTC 

GGGGTTCGTTTTGTAATGTATTGAATTGGTGGAGGTAGAATGGAAGTCATCTTGTATAGT 

TGTACTTGTATGTAAGGTTTGATAGTCATTTTTTATAAAGTAACTAATTTGTACAA 

>G1758 Amino Acid Sequence (domain in AA coordinates: TBD) 

MNYPSNPNPSSTDFTEFFKFDDFDDTFEKIMEEIGREDHSSSPTLSWSSSEKLVAAEITS 

PLQTSLATSPMSFEIGDKDEIKKRKRHKEDPIIHVFKTKSSIDEKVALDDGYKWRKYGKK

PITGSPFPRHYHKCSSPDCNVKKKIERDTNNPDYILTTYEGRHNHPSPSVVYCDSDDFDL

NSLNNWSFQTANTYSFSHSAPY* 

>G2148 (66..737)

GTCTCTAATATAAGCTTGAACGTTGCTATATATAAATGTAAAGGCGAACGCATAAGAAAA 
GAAAAATGGAGAATGAAGCTTTTGTAGATGGTGAATTGGAGTCTCTTTTGGGGATGTTCA 
ACTTTGATCAATGTTCATCTAACGAATCGAGCTTTTGCAATGCTCCAAATGAGACTGATG 
TTTTCTCTTCTGATGATTTCTTCCCATTTGGTACAATTCTGCAAAGTAACTATGCGGCCG 
TTCTTGATGGTTCCAACCACCAAACGAACCGAAATGTCGACTCAAGACAAGATCTGTTGA 
AACCAAGGAAGAAGCAAAAGTTAAGCTCGGAAAGCAATTTGGTTACCGAGCCTAAGACTG 
CTTGGAGAGATGGTCAAAGCCTAAGCAGTTATAATAGTTCAGATGATGAAAAGGCTTTAG 
GTTTAGTGTCTAATACATCAAAAAGCCTAAAACGCAAAGCGAAAGCCAACAGAGGGATAG 
CTTCCGATCCTCAGAGCCTATACGCTAGGAAACGAAGAGAAAGGATAAACGATAGGCTAA 
AGACATTGCAGAGCCTAGTTCCTAATGGGACAAAGGTCGATATAAGCACAATGCTGGAAG 
ATGCTGTCCATTACGTGAAGTTCCTGCAGCTTCAAATCAAGCTCTTGAGTTCAGAAGATC
TATGGATGTATGCACCTCTTGCTCACAATGGTCTGAATATGGGACTACATCACAATCTTT 
TGTCTCGGCTTATTTAAGACAAAATCATTGGAATAACATAACTTACAGTACTTGTTTTTT 
TTCTCGTTCTATATTCATGATTATGGTTATTTTTTGTTTGAGTTGTTCAATTTTTCTGTC 
TATTGCGTTCTATGAACTTGACACTCTTTTTGTAATTATTATATGCTAAAGACAATTTGG 
ACTAACAGCATTTTAATAAAAAAAAAAAA 

>G2148 Amino Acid Sequence (conserved domain in AA coordinates: 130-268)

MENEAFVDGELESLLGMFNFDQCSSNESSFCNAPNETDVFSSDDFFPFGTILQSNYAAVL

DGSNHQTNRNVDSRQDLLKPRKKQKLSSESNLVTEPKTAWRDGQSLSSYNSSDDEKALGL

VSNTSKSLKRKAKANRGIASDPQSLYARKRRERINDRLKTLQSLVPNGTKVDISTMLEDA

VHYVKFLQLQIKLLSSEDLWMYAPLAHNGLNMGLHHNLLSRLI*

>G2379 (52..798)

CGCCGTCACTCTCCTCCCGGTGCCGCACATTAGCAACACTACTCCCGACGAATGGAGACG 
ACGACGCCGCAGTCAAAATCAAGTGTGTCCCACCGACCGCCGTTGGGAAGAGAAGACTGG 
TGGAGTGAGGAAGCGACGGCGACGCTGGTAGAAGCCTGGGGCAATCGTTACGTCAAGCTG 
AACCACGGAAATCTCCGGCAGAATGACTGGAAAGACGTCGCCGACGCCGTTAACTCTAGA 
CACGGTGATAACAGCCGTAAGAAGACCGACTTACAGTGTAAGAACCGGGTCGATACTTTG 
AAGAAGAAGTACAAAACAGAGAAAGCTAAACTCTCGCCGTCGACTTGGCGTTTCTATAAC 
CGCCTCGATGTTCTAATCGGTCCCGTTGTGAAGAAATCGGCTGGCGGAGTTGTCAAATCA 
GCGCCTTTTAAGAATCATCTGAATCCAACTGGATCGAACTCTACTGGAAGCTCTCTTGAA 
GATGATGATGAGGATGATGATGAGGTTGGTGATTGGGAATTCGTTGCTAGGAAGCATCCT 
CGTGTGGAAGAGGTAGATCTGAGTGAAGGATCAACGTGTAGGGAACTAGCTACGGCGATT 
CTCAAGTTTGGAGAAGTTTACGAGAGAATTGAAGGGAAGAAGCAACAGATGATGATTGAG 
TTGGAGAAGCAGAGAATGGAAGTGACAAAGGAGGTAGAGTTAAAACGAATGAACATGTTG 
ATGGAGATGCAGTTAGAGATTGAGAAATCAAAGCACCGGAAACGCGCAAGTGCTTCAGGT 
AAGAAGAACTCACATTAGG 

>G2379 Amino Acid Sequence (domain in AA coordinates: 19-110, 173-232)

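METTTPQSKSSVSHRPPLGREDWWSEEATATLVEAWGNRYVKLNHGNLRQNDWKDVADAV

NSRHGDNSRKKTDLQCKNRVDTLKKKYKTEKAKLSPSTWRFYNRLDVLIGPVVKKSAGGV

VKSAPFKNHLNPTGSNSTGSSLEDDDEDDDEVGDWEFVARKHPRVEEVDLSEGSTCRELA

TAILKFGEVYERIEGKKQQMMIELEKQRMEVTKEVELKRMNMLMEMQLEIEKSKHRKRAS

ASGKKNSH*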

>G1462 (63..1031)

CGTCGACCATTCTTGCGATTGATCTTTCTCTAGAT^TTTTTTTGATCGATTTAGTTTCA 

TTATGGAGGACGACGACGCAGCTTATGATCTAATCAAACACGAACTGTTATACTCAGAAG 

ACGAAGTAATAATCTCACGTTATCTGAAGGGTATGGTCGTTAACGGAGATTCTTGGCCAG 

ATCACTTCATCGAAGACGCAAACGTGTTCACCAAGAATCCAGATAAGGTGTTCAATTCTG 

AGAGACCTAGATTCGTGATCGTTAAACCACGAACAGAGGCTTGTGGTAAAACCGATGGAT 

GTGATTCGGGTTGCTGGAGGATCATTGGTCGTGATAAACTGATAAAGTCGGAGGAGACTG 

GGAAGATTCTAGGGTTCAAGAAGATACTCAAGTTTTGCCTAAAGAGGAAACCTATAGACT 

ACAAGAGAAGTTGGGTAATGGAAGAGTATAGGCTTACCAATAACTTGAACTGGAAGCAAG 

ATCATGTGATTTGCAAAATTCGGTTTATGTTTGAAGCTGAAATTAGTTTCTTGCTAAGCA 

AGCATTTCTACACTACATCAGAATCGGTTCTTGAAAATGAGCTGTTGCCATCTTATGGAT 

ATTATTTATCCAATACACAAGAGGAGGATGAATTTTATCTGGACGCGATAATGACTTCGG 

AAGGAAACGAGTGGCCTAGCTACGTTACCAACAACGTGTACTGTCTGCATCCATTGGAGC 

TTGTGGATCTTCAAGATCGGATGTTTAATGATTACGGAACCTGCATCTTCGCTAACAAGA 

CTTGTGGTGAAACTGATAAATGCGATGGTGGTTACTGGAAGATCCTGCACGGTGATAAGC 

TGATCAAGTCAAATTTCGGAAAGGTCATTGGTTTCAAGAAGGTATTTGAGTTCTATGAAA 

CGGTGAGACAAATATATCTTTGTGATGGAGAAGAAGTGACGGTAACTTGGACTATACAAG 

AGTATAGGCTTAGCAAAAACGTGAAGCAGAATAAAGTGTTGTGCGTTATCAAGTTGACTT 

ATGATAGATAGGATACTTTACTTTGGTTTTTGTGATCATCTTAGTATCTTACGAATATTC 

TAGATACACACATCTATAGGCGACCGCTCTAGACAGGCCTCGTACCG 

>G1462 Amino Acid Sequence (domain in AA coordinates: TBD) 

MEDDDAAYDLIKHELLYSEDEVIISRYLKGMVVNGDSWPDHFIEDANVFTKNPDKVFNSE

RPRFVIVKPRTEACGKTDGCDSGCWRIIGRDKLIKSEETGKILGFKKILKFCLKRKPIDY

KRSWVMEEYRLTNNLNWKQDHVICKIRFMFEAEISFLLSKHFYTTSESVLENELLPSYGY

YLSNTQEEDEFYLDAIMTSEGNEWPSYVTNNVYCLHPLELVDLQDRMFNDYGTCIFANKT

CGETDKCDGGYWKILHGDKLIKSNFGKVIGFKKVFEFYETVRQIYLCDGEEVTVTWTIQE
YRLSKNVKQNKVLCVIKLTYDR*
>G1211 (44..1120)

TGAAACCTAGATTTCTGCAACTGAATTCCTAATTCGAAAAAGAATGGAGGGTTCGTCGTC 
GACGATAGCAAGGAAGACATGGGAACTAGAGAACAGCATTCTAACAGTAGACTCACCTGA 
TTCAACCTCCGACAACATCTTCTACTACGACGATACTTCACAGACTAGGTTCCAGCAAGA 
Gi\AACCGTGGGAGAATGATCCTCACTACTTTAAACGAGTCAAGATCTCAGCGCTCGCTCT 
TCTTAAGATGGTGGTrCACGCTCGCTCTGGTGGTACAATTGAAATAATGGGTCTTATGCA 
AGGTAAGACCGATGGTGATACTATCATTGTTATGGATGCTTTTGCTTTACCAGTGGAAGG 
TACTGAGACAAGGGTTAATGCTCAGGATGATGCTTATGAGTACATGGTTGAGTATTCACA 
GACCAACAAGCTCGCGGGGCGGCTGGAGAATGTTGTTGGATGGTATCACTCTCACCCTGG 
ATATGGATGCTGGCTCTCCGGTATTGATGTTTCTACGCAGACGCTTAACCAACAGCATCA 
GGAGCCATTTTTAGCTGTTGTTATTGATCCCACAAGGACTGTTTCAGCTGGTAAGGTTGA 
GATTGGTGCTTTCAGAACATACTCTAAAGGATATAAGCCTCCAGATGAACCTGTTTCTGA 
GTATCAiU^CTATTCCTTTAAATAAGATTGAGGACTTTGGTGTTCACTGCAAACAGTACTA 
TTCATTAGATGTCACTTATTTCAAGTCATCTCTTGATTCTCACCTTCTGGATCTACTATG 
GAACAAGTACTGGGTGAACACTCTTTCTTCTTCTCCACTGCTGGGTAATGGAGACTATGT 
TGCTGGACAAATATCAGACTTAGCTGAGAAGCTTGAGCAAGCCGAGAGTCATCTGGTTCA 
GTCTCGCTTTGGAGGAGTTGTGCCATCATCCCTTCATAAGAAAAAAGAAGATGAGTCTCA 
ACTAACTAAGATAACTCGGGATAGCGCAAAGATAACTGTGGAACAGGTCCATGGACTAAT 
GTCGCAGGTCATAAAAGATGAATTATTCAACTC^^ 

CACTGACTCGTCGGATCCAGACCCTATGATTACATATTGAAGTTGCTCTTCTTTTGGTTT 

CTANTTTTGGATTGACCCATCATTTGTTGTCCTTTCATTTATTTTCTGTTGTGTAAAGAA 

TTATAATGNCGNCGCGAATTCGCGGCCGCTAAAAAAANACAGGAAATTGAAAANAATTCN 

NCCATTCCAACATCTTTATTTAATATTATCTCCTCNATTATATAATATTCAAACATCCCT 

ANTANCTTCATTTGACGGTCCCCCTCCCTCCCGTGTTGCNTTGGTGCTGGCCCC 

>G1211 Amino Acid Sequence (domain in AA coordinates: 123-179) 

MEGSSSTIARKTWELENSILTVDSPDSTSDN^ 

ISAIiAIiLKMWHARSGGTIE IMGLMQGKTDGDTI IVMDAFALPVEGTETRVNAQDDAYEY 
MVEYSQTNK^GRLENWG 

SAGKVEIGAFRTYSKGYKPPDEPVSEYQTIPLNKIEDFGVHCKQYYSLDVTYFKSSriDSH 

LLDLLWNKYWVNTLSSSPL^ 

KEDESQLTKITRDSAKITVEQVHGL^ 

>G1048 (5..892)

GACCATGGCGGAGGAATTTGGAAGCATAGATTTACTCGGAGATGAAGATTTCTTCTTCGA 
TTTCGATCCTTCAATCGTAATTGATTCTCTTCCGGCGGAGGATTTTCTTCAGTCTTCACC 
GGATTCATGGATCGGAGAAATCGAGAATCAATTGATGAACGATGAGAATCATCAAGAGGA 
GAGTTTTGTGGAATTGGATCAGCAATCGGTTTCAGATTTCATAGCGGATCTACTCGTTGA 
TTATCCAACTAGCGATTCTGGCTCCGTTGATTTGGCGGCTGATAAAGTTCTAACCGTCGA 
TTCTCCCGCCGCCGCTGATGATTCCGGGAAGGAGAATTCGGATTTGGTTGTTGAGAAGAA 
GTCTAATGATTCTGGTAGCGAGATTCATGATGATGATGACGAAGAAGGAGACGATGATGC 
TGTGGCTAAAAAACGAAGAAGGAGAGTAAGAAATAGAGATGCGGCGGTTAGATCGAGAGA 
GAGGAAGAAGGAATATGTACAAGATTTAGAGAAGAAGAGTAAGTATCTCGAAAGAGAATG 
CTTGAGACTAGGACGTATGCTTGAGTGCTTCGTTGCTGAAAACCAGTCTCTACGTTACTG 
TTTGCAAAAGGGTAATGGCAATAATACTACCATGATGTCGAAGCAGGAGTCTGCTGTGCT 
CTTGTTGGAATCCCTGCTGTTGGGTTCCCTGCTTTGGCTTCTGGGAGTAAACTTCATTTG 
CCTATTCCCTTATATGTCCCACACAAAGTGTTGCCTCCTACGTCCAGAACCAGAAAAGCT 
GGTTCTAAACGGGCTCGGGAGTAGTAGCAAACCGTCTTATACCGGCGTTAGTCGGAGATG 
TAAGGGTTCGAGGCCTAGGATGAAATACCAAATCTTAACCCTTGCGGCGTGACAACGCCT
TTTTTAACTGCTTCTTTTGCGCATTTTGAGTTGTAGATGAGTGTCTTTTAGTTTTCTCTC 
TCTTGTTTTGTATTTCGCTGTTGAAAGTTTTCTGTCTAATATCGATAAGTTAACAGTGAA 
AAAAAAAAAAAAAAA 

>G1048 Amino Acid Sequence (domain in AA coordinates: 138-190)
MAEEFGSIDLLGDEDFFFDFDPSIVIDSLPAEDFLQSSPDSWIGEIENQLMNDENHQEES
FVELDQQSVSDFIADLLVDYPTSDSGSVDLAADKVLTVDSPAAADDSGKENSDLVVEKKS
NDSGSEIHDDDDEEGDDDAVAKKRRRRVRNRDAAVRSRERKKEYVQDLEKKSKYLERECL
RLGRMLECFVAENQSLRYCLQKGNGNNTTMMSKQESAVLLLESLLLGSLLWLLGVNFICL
FPYMSHTKCCLLRPEPEKLVLNGLGSSSKPSYTGVSRRCKGSRPRMKYQILTLAA*
>G986 (31..846)

CATTAAATTGGCTCCTGTGAACCTAAATTTATGGACTATGATCCCAACACCAATCCGTTC 
GACCTTCATTTCTCCGGTAAACTTCCGAAAAGAGAAGTCTCGGCTTCAGCTTCTAAAGTT 
GTAGAGAAGAAATGGTTAGTGAAAGATGAGAAGAGAAATATGCTACAAGATGAAATAAAC 
CGGGTTAATTCGGAGAACAAGAAGCTAACCGAAATGTTAGCAAGAGTCTGTGAGAAGTAC 
TATGCTCTTAATAATCTTATGGAGGAGTTGCAGAGTCGAAAGAGTCCTGAAAGTGTTAAC 
TTTCAGAACAAACAGCTAACGGGGAAACGAA7^ACAAGAACTTGATGAGTTTGTTAGCTCC
CCAATTGGACTCAGTCTCGGACCAATCGAGAACATCACCAACGATAAAGCGACGGTTTCA 
ACCGCTTACTTTGCTGCTGAGAAGTCTGACACAAGCTTGACTGTGAAAGATGGATATCAA 
TGGAGGAAATACGGGCAAAAGATTACGAGAGATAATCCATCTCCTAGAGCTTACTTCAGA 
TGCTCGTTTTCACCGTCTTGTCTAGTCAAGAAGAAGGTGCAACGAAGTGCAGAAGATCCA 
TCTTTCTTGGTAGCCACTTACGAAGGGACACATAACCACACCGGACCACATGCAAGTGTG 
TCCAGGACAGTGAAACTTGATCTAGTTCAAGGTGGGCTTGAACCAGTTGAGGAAAAGAAA 
GAGAGAGGGACGATTC^UVGAGGTTrTGGTGCAACAAATGGCTTCTTCGTTGACCAAAGAT 
CCTAAGTTCACTGC^GCTCTTGCGACTGCTATTTCCGGGAGATTGATAGAGCATTCAAGA 
AGATGAAAGTTCTCTAGAACATGTATATTTCTGTTTTGTTCTATTTTGTTGCTCATTCCT 
AGTAAAAAGGTAAAGATTTGTTTGATCTTGATTAGGAGGCATAGATGTCAATTTTAATGT 
GTGTGTATATAATTACATCAAATCTAAGTATCCAAAAAGGGTCACCCCCATTTTATCTTA 
TG 

>G986 Amino Acid Sequence (domain in AA coordinates: 146-203)
MDYDPNTNPFDLHFSGKLPKREVSASASKVVEKKWLVK^ 
EMLARVCEK^TYALNNIjMEELQSRKSPESVNFQ 
NITNDKATVSTAYFAAEKSDTSLTVKDGYQWRCT^ 

KKVQRSAEDPSFLVATYEGTHimTGPHASVSRTVKLDLVQGGLEPVEEKKERG 

QQMASSLTKDPKFTAALATAISGRLIEHSRT*

>G789 (259..1593)

GGCAAGAAGAACCTTAGCCTCTCTTTCTTCTTTCTCTCTCTCTCTCTCTGTGGTACTGTT 
CTGTTTCAACTTTACTCCCTCAGTTTCAGAACAATTCCCTATCTAGAAGAGAGATAAAAC 
CGAGAAGGTTTTGGAGATAGAATCTTTTGTTCTTCTTTTGTCCCTCCTTGCTCGATTTTT 
GTTACGTGTGAAGGAATAAAAAAAAACTGATATAGCTAAATCTTCCATCCATTCAGAGGC
TTCTAAATCTGATCTGACATGGAACAAGTGTTTGCTGATTGGAATTTTGAAGATAATTTT 
CACATGTCCACTAATAAAAGATCAATCAGACCAGAAGATGAATTAGTGGAGCTATTGTGG 
AGAGATGGTCAAGTGGTTTTACAAAGCCAAGCTCGTAGAGAACCGTCAGTCCAAGTCCAA 
ACCCACAAACAAGAAACCCTAAGAAAACCCAAC^ 

GTACAAAAGCCTAACTACGCTGCTCTAGATGATCAAGAAACCGTCTCCTGGATACAATAC 
CCTCCGGATGACGTCATCGACCCTTTCGAATCCGAGTTCTCCTCTCATTTCTTCTCTTCG 
ATCGATCACCTCGGAGGTCCTGAGAAGCCACGAACGATCGAAGAGACAGTTAAGCATGAG 
GCTCAAGCCATGGCTCCTCCTAAGTTTAGATCCTCGGTTATAACAGTCGGACCGAGTCAT 
TGCGGCAGCAACCAGTCAACAAATATTCATCAGGCCACTACACTTCCGGTTTCTATGAGT 
GATAGAAGCAAGAACGTCGAAGAAAGACTTGACACTTCGTCAGGTGGCTCCTCCGGTTGC 
AGCTATGGAAGGAACAAGAAAGAAACCGTTAGTGGAACAAGTGTAACCATTGACCGTAAA 
AGAAAACATGTTATGGATGCTGATCAAGAATCTGTGTCTCAATCAGATATAGGTTTGACC 
TCAACCGATGATCAAACCATGGGTAACAAATCGAGCCAACGGTCAGGATCTACTCGAAGA 
AGCCGTGCAGCTGAAGTTCATAATCTCTCAGAAAGGAGGAGGAGAGATCGGATCAATGAA
AGAATGAAAGCTCTTCAAGAACTCATACCTCACTGCAGCAGAACAGATAAAGCTTCGATA 
TTGGATGAAGCAATTGATTACTTAAAATCACTTCAAATGCAACTCCAAGTGATGTGGATG 
GGAAGTGGAATGGCGGCGGCGGCAGCAGCAGCAGCAAGTCCGATGATGTTTCCCGGGGTA 
CAATCATCTCCATACATTAATCAGATGGCTATGCAAAGTCAGATGCAATTGTCTCAATTC 
CCGGTTATGAACCGGTCCGCTCCGCAGAACCATCCCGGTTTAGTATGTCAAAACCCGGTA 
CAGTTGCAGCTCCAAGCACAGAACCAAATCTTATCGGAGCAGCTCGCTAGGTACATGGGC 
GGGATTCCCCAGATGCCGCCGGCGGGAAATCAGATGCAGACCGTGCAACAACAACCAGCG 
GACATGTTGGGATTTGGATCTCCGGCGGGACCGCAAAGTCAACTGTCGGCACCGGCGACC 
ACCGACAGTCTTCATATGGGTAAAATAGGCTGACTTGGCATATAGTTTTCCTCCGAAATT 
ATTCTTCTTACAGTTGGTGATTGTTATTTATTTTTGGTCGCCTAAGCAAGCATAAAAGCT
AAGTCAAATGTATTATAGAGATCTAATAAGTTAGTCTCATACTTATAACTTATTTTTAAA 
CAGTTGAATTATAGTATCAATCAAGTGTTGGGAACCTAAAGATCATACATGTGTCAATAC 
TTTTATATTTGTTCTCAAGGTTCATCAGAAAAACAAAATAAAAAGGATAGACTAGGCCTG 
CATTTGACATTATCATGGGCTTTTTTGGGTCTATGAATATGAACATTAACCCC

>G789 Amino Acid Sequence (domain in AA coordinates: 253-313) 

MEQVFADWNFEDNFHMSTNKRSIRPEDELVELLWRDGQVVLQSQARREPSVQVQTHKQET

LRKPNNIFLDNQETVQKPNYAALDDQETVSWIQYPPDDVTDPFESEFSSHFFSSIDHLGG

PEKPRTIEETVKHEAQAMAPPKFRSSVITVGPSHCGSNQSTNIHQATTLPVSMSDRSKNV 

EERLDTSSGGSSGCSYGRNNKETVSGTSVTIDRKRKHVMDADQESVSQSDIGLTSTDDQT 

MGNKSSQRSGSTRRSRAAEVHNLSERRRRDRINERMKALQELIPHCSRTDKASILDEAID

YLKSLQMQLQVMWMGSGMAAAAAAAASPMMFPGVQSSPYINQMAMQSQMQLSQFPVMNRS 

APQNHPGLVCQNPVQLQLQAQNQILSEQLARYMGGIPQMPPAGNQMQTVQQQPADMLGFG

SPAGPQSQLSAPATTDSLHMGKIG* 

>G2085 (1..930) 

ATGTTTGGTCGCCATTCGATTATCCCAAATAACCAGATTGGTACCGCCTCTGCTTCCGCT 
GGTGAAGACCATGTCTCTGCCTCCGCTACGTCTGGTCACATTCCTTACGACGATATGGAA 
GAAATCCCTCATCCTGACTCTATCTATGGTGCTGCCTCCGATTTGATTCCCGATGGCTCT 
CAATTGGTTGCTCACCGATCCGATGGCTCTGAATTACTTGTTTCTCGGCCACCGGAAGGG 
GCGAATCAGCTTACGATCTCGTTCCGTGGACAAGTTTACGTTTTTGATGCCGTTGGTGCT 
GACAAGGTGGATGCTGTGTTGTCGCTGTTGGGTGGTTCTACTGAGCTTGCTCCTGGTCCG 
CAGGTGATGGAACTAGCTCAACAGCAGAATCATATGCCTGTTGTAGAATATCAGAGCCGC 
TGTAGCCTTCCGCAACGGGCACAATCCTTGGATAGGTTTCGGAAGAAGAGGAATGCTAGA 
TGTTTCGAGAAGAAAGTAAGATACGGTGTTCGCCAAGAAGTTGCCTTAAGAATGGCACGT 
AATAAAGGTCAATTCACCTCTTCAAAGATGACAGATGGGGCTTATAACTCTGGCACAGAT 
CAAGATTCTGCCCAAGATGATGCCCATCCAGAAATATCGTGTACTCATTGCGGCATTAGT 
TCCAAATGTACACCAATGATGCGACGTGGCCCTTCCGGCCCCAGGACTCTCTGCAATGCC 
TGTGGACTTTTTTGGGCTAACAGGGGTACATTGAGGGATCTCTCAAAGAAAACAGAAGAG 
AATCAGTTGGCTTTAATGAAACCGGATGATGGTGGGAGTGTTGCTGATGCTGCTAACAAC 
TTAAACACTGAAGCTGCAAGTGTTGAAGAACACACTTCCATGGTTTCTCTTGCCAATGGG 
GATAATTCTAATCTGTTAGGTGATCACTAA 

>G2085 Amino Acid Sequence (domain in AA coordinates: TBD) 
MFGRHSIIPNNQIGTASASAGEDHVSASATSGHIPYDDMEEIPHPDSIYGAASDLIPDGS
QLVAHRSDGSELLVSRPPEGANQLTISFRGQVYVFDAVGADKVDAVLSLLGGSTELAPGP 
QVMELAQQQNHMPVVEYQSRCSLPQRAQSLDRFRKK^ 

NKGQFTSSKMTDGAYWSGTDQDSAQDDAHPEISCTHCGISSKCTPMMRRGPSGPRTLCNA

CGLFWANRGTLRDLSKKTEENQLALM 

DNSNLLGDH*

>G1783 (1..603) 

ATGGCCGCGTTTCCGCAGTGGACAAGGGTCGATGACAAACGTTTTGAGTTAGCTCTGCTT 
CAAATCCCGGAGGGTTCGCCGAATTTTATAGAGAATATCGCCTATTATCTCCAGAAACCG 
GTGAAGGAGGTGGAGTACTACTACTGCGCGTTGGTCCATGATATTGAGCGGATCGAATCG 
GGTAAGTATGTTTTGCCCAAATACCCGGAAGACGATTACGTGAAACTGACGGAAGCAGGT 
GAGTCTAAGGGCAATGGGAAAAAGACGGGAATTCCTTGGTCAGAAGAGGAACAGAGGTTG 
TTTCTGGAAGGACTAAATAAGTTTGGGAAAGGAGACTGGAAGAACATATCGAGGTATTGT 
GTGAAGTCAAGGACCTCGACGCAAGTGGC7^AGCCATGCTCAGAAGTATTTTGCAAGGCAA 
AAGCAGGAGAGTACGAATACTAAACGCCCGAGTATTCATGACATGACTCTGGGAGTTGCG 
GTCAATGTCCCTGGATCCAACTTGGAGTCTACTGGCCAGCAACCACATTTTGGTGATCAA 
ATTCCTTCGAATCAATATTATCCCTCCCAGGAAAACTTTCGGGGTTTTGATCAGCGATGG 
TGA 

>G1783 Amino Acid Sequence (domain in AA coordinates: 81..129)
MAAFPQWTRVDDKRFELALLQIPEGSPNFIENI^^ 

GKYVLPKYPEDDYVKLTEAGESKGNGKKTGIPWSEEEQRLFLEGLNKFGKGDWKNISRYC
VKSRTSTQVASPIAQKTFARQKQESTNTKRPSIHDMTLGVAVNVPGSNLESTGQQPHFGDQ 
IPSNQYYPSQENFRGFDQRW*
>G2072 (155..793)

TCGACCCACGCGTCCGCCCACGCGTCCGGATCTTTTCACAGAAGACCAACCAGCTTGGCT 
CGATGAGCTCCTAAGTGAGCCAGCATCACCTAAGATTAACAAAGGTCATAGACGTTCAGC
TAGTGACACAGCTGCTTACTTGAACTCAGCTTTAATGCCTTCGAAGGAAAATCATGTTGC 
TGGTTCGTCTTGGCAGTTCCAGAACTATGATTTGTGGCAGTCCAACTCTTATGAACAACA 
CAATAAATTAGGATGGGATTTCTCTACAGCAAATGGAACTAATATCCAAAGAAATATGTC
ATGCGGAGCTTTAAATATGTCGTCGAAACCCATTGAGAAACATGTAAGCAAAATGAAAGA
AGGAACTTCTACAAAACCAGATGGTCCTAGATCAAAGACTGACTCAAAACGTATCAAACA 
TCAAAATGCTCATCGAGCGCGTTTGAGAAGGCTTGAGTACATATCAGACCTTGAAAGGAC 
CATCCAAGTGCTACAAGTTGAAGGATGTGAAATGTCATCTGCCATTCACTACTTGGATCA 
GCAGTTACTCATGCTTAGCATGGAAAATAGAGCTTTAAAACAACGTATGGATAGTTTAGC 
AGAAATCCAAAAGCTTAAAGATGTGGAGCAGCAATTGCTTGAGAGAGAGATAGGAAACCT 
ACAGTTTCGACGACACCAACAACAACCACAGCAAAACCAAAAACAAGTCCAAGCAATACA 
AAATCGATACACCAAATATCAACCACCTGTTACACAAGAACCCGATGCCCAATTTGCAGC 
CTTGGCAATATGATTTAGGAAATATGGATACATTGTTCAGATTAAGCTGAGCTCCTCTTG 
CTCTACCTTAATGTCCATACAACATAGGTGAACTTGATGTTTGTAGCCTTGAATGAAAAC 
CTAAAAAAGCATCGTTATGTAAATCAAAATGTGGTTGCCCATATCCTCCTCTATTGCATT 
TCTCTCTATTTATGGCATGGTAGAGAATCTCTTGTCAAGAAACTTCATGTTATGTAATAA 
CTTGTAATCCTTCTTATTTCATCTATTATATATATGAATAAGTAATTTTTTTGCCAAAAA 
AAAAAAAAAAAAAAAAAAA 

>G2072 Amino Acid Sequence (conserved domain in AA coordinates: 90-149)

MPSKE1IEWAGSSWQFQNYDLWQSNSYE 

EKHVSKMKEGTSTKPDGPRSKTDSKRII^ 

SSAIHYLDQQLLMLSMENRALKQRMDSLAEIQKLKHVEQQLLEREIGNLQFRRHQQQPQQ
NQKQVQAIQNRYTKYQPPVTQEPDAQFAALAI*
>G931 (85..1071)

GGAGGTTCTTTGACAGACACATGTATCATCAATCTTCTCTGTTGAAGCAGAGAGAGAGAG 
AGCTAATTGTTGCCTCTGAGTCACATGGATAAGAAAGTTTCATTTACTAGCTCTGTGGCA 
CATTCAACTCCACCATACCTTAGTACTTCCATCTCATGGGGACTTCCAACCAAATCCAAT 
GGTGTGACTGAATCACTGAGTTTGAAGGTGGTAGATGCAAGACCAGAACGTCTTATAAAC 
ACAAAGAATATC^GTTTCCAGGACCAGGATTCATCTTCAACTCTGTCCTCTGCTCAATCT 
TCTAACGATGTTACAAGTAGTGGAGATGATAACCCCTCAAGACAAATCTCATTTTTAGCA 
CATTCAGATGTTTGTAAAGGATTTGAAGAAACTCAAAGGAAGCGATTTGCAATTAAATCA 
GGCTCCTCCACGGCAGGAATCGCTGATATTCACTCTTCTCCTTCCAAGGCTAACTTCTCA 
TTTCACTATGCCGATCGACATTTTGGTGGTTTAATGCCTGCGGCTTACCTACCACAGGCA 
ACAATATGGAATCCCCAAATGACTCGAGTTCCGCTACCATTCGATCTCATAGAGAATGAG 
CCTGTCTTTGTCAATGCAAAGCAATTCCATGCAATTATGAGGAGGAGGCAACAGCGTGCT 
AAGCTAGAGGCGCAAAACAAACTAATCAAAGCCCGTAAGCCGTATCTTCATGAATCTCGA 
CATGTTCACGCTCTTAAACGACCTAGAGGATCTGGTGGAAGATTCCTAAACACCAAAAAG 
CTTCAAGAATCTACAGATCCAAAACAAGACATGCCAATCCAACAGCAACACGCAACGGGA 
AACATGTCAAGATTTGTGCTTTATCAGTTGCAGAACAGCAATGACTGTGATTGTTCAACC 
ACTTCTCGCTCTGACATCACATCTGCTTCTGACAGCGTTAATCTCTTTGGACACTCTGAA 
TTTCTGATATCAGATTGCCCATCTCAGACAAACCCAACAATGTATGTTCATGGTCAATCA 
AATGACATGCATGGAGGTAGGAACACACACCATTTCTCTGTCCATATCTGAGCCGGTGGA 
ATCTGGTAATGTGTACGTTCCTACAAAAAAAGGGAAGTCATCCTTGGCTGCTACTTCGCT 
TATTAGCTAGTTCTTATTTCACACGCTTTGTCCAGATATC 

>G931 Amino Acid Sequence (domain in AA coordinates: TBD) 

MDKKVSFTSSVAHSTPPYLSTSISWGLPTKSNGVTESLSLKVVDARPERLINTKNISFQD

QDSSSTLSSAQSSNDVTSSGDDNPSRQISFLAHSDVCKGFEETQRKRFAIKSGSSTAGIA

DIHSSPSKANFSFHYADPHFGGLMPAAYLPQATIWNPQMTRVPLPFDLIENEPVFVNAKQ

FHAIMRRRQQRAIOLEAQNKIjIKARKPYLHESRHvIIA^ 

QDMPIQQQHATGNMSRFVLYQLQNSNDCDCSTTSRSDITSASDSVNLFGHSEFLISDCPS
QTNPTMYVHGQSNDMHGGRNTHHFSVHI*
>G278 (93.. 187^) 

TCGATCTTTAACCAAATCCAGTTGATAAGGTCTCTTCGTTGATTAGCAGAGATCTCTTTA 
ATTTGTGAATTTCAATTCATCGGAACCTGTTGATGGACACCACCATTGATGGATTCGCCG 
ATTCTTATGAAATCAGCAGCACTAGTTTCGTCGCTACCGATAACACCGACTCCTCTATTG 
TTTATCTGGCCGCCGAACAAGTACTCACCGGACCTGATGTATCTGCTCTGCAATTGCTCT 
CCAACAGCTTCGAATCCGTCTTTGACTCGCCGGATGATTTCTACAGCGACGCTAAGCTTG 
TTCTCTCCGACGGCCGGGAAGTTTCTTTCCACCGGTGCGTTTTGTCAGCGAGAAGCTCTT 
TCTTCAAGAGCGCTTTAGCCGCCGCTAAGAAGGAGAAAGACTCCAACAACACCGCCGCCG 
TGAAGCTCGAGCTTAAGGAGATTGCCAAGGATTACGAAGTCGGTTTCGATTCGGTTGTGA 
CTGTTTTGGCTTATGTTTACAGCAGCAGAGTGAGACCGCCGCCTAAAGGAGTTTCTGAAT 
GCGCAGACGAGAATTGCTGCCACGTGGCTTGCCGGCCGGCGGTGGATTTCATGTTGGAGG
TTCTCTATTTGGCTTTCATCTTCAAGATCCCTGAATTAATTACTCTCTATCAGAGGCACT 
TATTGGACGTTGTAGACAAAGTTGTTATAGAGGACACATTGGTTATACTCAAGCTTGCTA 
ATATATGTGGTAAAGCTTGTATGAAGCTATTGGATAGATGTAAAGAGATTATTGTCAAGT 
CTAATGTAGATATGGTTAGTCTTGAAAAGTCATTGCCGGAAGAGCTTGTTAAAGAGATAA 
TTGATAGACGTAAAGAGCTTGGTTTGGAGGTACCTAAAGTAAAGAAACATGTCTCGAATG 
TACATAAGGCACTTGACTCGGATGATATTGAGTTAGTCAAGTTGCTTTTGAAAGAGGATC 
ACACCAATCTAGATGATGCGTGTGCTCTTCATTTCGCTGTTGCATATTGCAATGTGAAGA 
CCGCAACAGATCTTTTAAAACTTGATCTTGCCGATGTCAACCATAGGAATCCGAGGGGAT 
ATACGGTGCTTCATGTTGCTGCGATGCGGAAGGAGCCACAATTGATACTATCTCTATTGG 
AAAAAGGTGCAAGTGCATCAGAAGCAACTTTGGAAGGTAGAACCGCACTCATGATCGCAA 
AACAAGCCACTATGGCGGTTGAATGTAATAATATCCCGGAGCAATGCAAGCATTCTCTCA 
AAGGCCGACTATGTGTAGAAATACTAGAGCAAGAAGACAAACGAGAACAAATTCCTAGAG 
ATGTTCCTCCCTCTTTTGCAGTGGCGGCCGATGAATTGAAGATGACGCTGCTCGATCTTG 
AAAATAGAGTTGCACTTGCTCAACGTCTTTTTCCAACGGAAGCACAAGCTGCAATGGAGA 
TCGCCGAAATGAAGGGAACATGTGAGTTCATAGTGACTAGCCTCGAGCCTGACCGTCTCA 
CTGGTACGAAGAGAACATCACCGGGTGTAAAGATAGCACCTTTCAGAATCCTAGAAGAGC 
ATCAAAGTAGACTAAAAGCGCTTTCTAAAACCGTGGAACTCGGGAAACGATTCTTCCCGC 
GCTGTTCGGCAGTGCTCGACCAGATTATGAACTGTGAGGACTTGACTCAACTGGCTTGCG 
GAGAAGACGACACTGCTGAGT^CGACTAC^AAAGAAGCAAAGGTACATGGAAATAC^G 
AGACACTAAAGAAGGCCTTTAGTGAGGACAATTTGGAATTAGGAAATTCGTCCCTGACAG 
ATTCGACTTCTTCCACATCGAAATCAACCGGTGGAAAGAGGTCTAACCGTAAACTCTCTC 
ATCGTCGTCGGTGAGACTCTTGCCTCTTAGTGTAATTTTTGCTGTACCATATAATTCTGT 
TTTCATGATGACTGTAACTGTTTATGTCTATCGTTGGCGTCATATAGTTTCGCTCTTCGT 
TTTGCATCCTGTGTATTATTGCTGCAGGTGTGCTTCAAACAAATGTTGTAACAATTTGAA 
CCAATGGTATACAGATTTGTAATATATATTTATGTACATCAAC^TAAAAAAAAAAAAAA 
AAAA 

>G278 Amino Acid Sequence (domain in AA coordinates: 2-593) 
I^TTIDGFADSYEISSTSFVATDNTD^ 

DDFYSDAKLVIiSDGREVSFHRCVLSARSSFFKSALAAAKKEKDS™ 

YEVGFDSWTVLAYVYSSRWPPPKGVSECADENCCH^^ 

ELITLYQRHLLDWDKWIEDTLVILKI^ 

LPEELVKE I IDRRKELGLEVPKVKKHVSETVHICA^ 

FAVAYCNVKTATDLLKDDIiADVNH^ 

EGRTALMIAKQATMAVECNNIPEQCKHSLKGRLCVEILEQEDKREQIPRDVPPSFAVAAD
ELKMTLLDLEimVAIiAQRLFPTEAQAAM^ 

IAPFRILEEHQSRLKALSKTVELGKRFFPRCSAVLDQIMNCEDLTQLACGEDDTAEKRLQ
KKQRYMEIQETLKKAFSEDNLELGNSSLTDSTSSTSKSTGGKRSNRKLSHRRR* 
>G2421 (1..630)

ATGGAGGGTTCGTCCAAAGGGTTGAGGAAAGGTGCATGGACTGCTGAAGAAGATAGTCTC 
TTGAGGCAGTGTATTGGTAAGTATGGAGAAGGCAAATGGCATCAAGTTCCTTTAAGAGCT 
GGGCTAAATCGGTGCAGGAAAAGTTGTAGACTAAGATGGTTAAACTATTTGAAGCCAAGT 
ATCAAGAGAGGAAAATTTAGTTCTGATGAAGTTGATCTTCTTCTTCGTCTTCATAAGCTT 
CTAGGAAATAGGTGGTCCTTGATTGCTGGTCGATTACCTGGTCGGACCGCTAATGATGTC 
AAGAACTACTGGAACACCCATCTGAGTAAGAAGCATGAACCGTGTTGTAAAACTAAGATA 
AAAAGGATAAATATTATAACCCCTCCTAATACACCGGCCCAAAAAGTTTGTGAAAATAGT 
ATCACATGTAACAAAGATGATGAGAAAGATGATTTTGTGGATAATTTTATGGTTGGAGAT 
AATATATGGTTGGAGCGTTTGCTAGACGAGGGCCAAGAGGTAGATGTGCTGGTTACAGAA 
GCGGCGGCAACAGAAAAGGAGGGCACTTTGGCGTTTGACGTTGAGCAACTTTGGAATTTG 
TTCGATGGAGAGACTGTGATCTTTGATTAGTGTTTATAAACGTTTGTGTTCTCTTGTTTG 
TGAGGTTTCTCTATTTAATTTAGTATCTATTTTCTAAATTAACTAATATCTTATAGTATT 
TTAGGCAAACCTTATGTTTCCGTTTCTGTGCGGCCGCTCTAG 

>G2421 Amino Acid Sequence (domain in AA coordinates: 9-110) 
MEGSSKGLRKGAWTAEEDSLLRQCIGKYGEGKWHQVPLRAGLNRCRKSCRLRWLNYLKPS
IKRGKFSSDEVDLLLRLHKIjLGNRWSLIAGRLPGRTAJ^ 

KRINIITPPNTPAQKVCENSITCNKDDEKDDFVDNFMVGDNIWLERLLDEGQEVDVLVTE
AAATEKEGTLAFDVEQLWNLFDGETVIFD* 






>G2032 (53..1789)

TCCCTCCCAGAGTAAGAACTTCCATACTTTGCTCTAGATTTCTTGAGAAAAGATGCAGCC 
GATCTTCCATGCGATCCTTAAAAATGACCTTCCAGCTTTTTTAGAGTTGGTAGAAGATAG 
TGAATCGTCTCTGGAGGAGAGAAACGAGGAAGAACACTTGAACAACACGGTTTTGCACAT 
GGCTGCAAAGTTTGGTCACCGAGAACTCGTCTCCAAGATTATTGAGCTCCGACCTTCCCT 
CGTGTCTTCCCGCAACGCATACAGAAACACACCTTTGCATCTTGCTGCTATCCTTGGAGA 
TGTAAACATAGTTATGCAGATGTTAGAGACTGGATTGGAAGTGTGTTCTGCACGCAATAT 
CAACAACCACACACCACTCCACTTGGCTTGCCGTAGCAATTCCATAGAGGCTGCCAGACT 
CATCGCGGAAAAGACACAATCAATTGGCCTCGGTGAACTCATTCTCGCCATATCAAGTGG 
ATCCACTAGTATCGTAGGGACTATACTGGAGAGATTCCCAGACCTAGCTAGGGAAGAAGC 
TTGGGTGGTTGAAGACGGCTCACAATCAACGCTACTGCATCATGCGTGTGATAAGGGAGA 
CTTTGAACTGACAACTATATTGTTAGGGCTCGATCAAGGATTAGAAGAAGCACTTAACCC 
CAATGGTTTATCACCTCTGCATCTTGCGGTCCTCAGAGGCTCGGTTGTGATCCTGGAGGA 
GTTCTTGGACAAGGTTCCATTGTCTTTCAGCTCAATGACGCCGTCGAAAGAGACAGTCTT 
TCATCTCGCTGCTCGAAACAAAAATATGGATGCCTTTGTTTTTATGGCAGAGAGTTTGGG 
AATTAACAGCCAAATTCTTCTACAGCAAACCGATGAAAGTGGCAACACTGTCTTACATAT 
TGCTGCATCCGTCTCTTTTGATGCTCCTCTTATACGTTACATTGTTGGTAAGAATATAGT 
AGATATCACGTCCAAGAACAAGATGGGTTTTGAAGCTTTTCAACTTCTCCCTCGAGAAGC 
CCAAGACTTTGAGTTGTTATCAAGGTGGCTGAGATTTGGTACCGAGACTTCACAAGAGCT 
GGATTCTGAGAACAATGTAGAACAACACGAAGGCTCTCAAGAGGTCGAGGTAATACGGTT 
GCTAAGGATTAI'AGGAATAAACACATCAGAGATAGCAGAGAGAAAGAGAAGCAAGGAACA 
GGAAGTGGAAAGAGGTCGTCAGAACTTGGAATATCAGATGCATATAGAAGCATTACAGAA 
TGCAAGAAATACGATTGCTATAGTGGCAGTCTTGATTGCTTCAGTTGCTTATGCCGGTGG 
GATAAACCCTCCGGGGGGCGTCTACCAAGACGGGCCATGGAGAGGGAAATCCTTAGTGGG 
GAAAACAACGGCGTTTAAGGTCTTTGCGATATGCAACAACATCGCACTGTTCACGTCCTT 
GGGCATCGTTATTCTTCTTGTTAGCATCATACCTTACAAGAGGAAACCCTTAAAGAGATT 
ATTGGTGGCCACGCATAGGATGATGTGGGTTTCTGTAGGTTTCATGGCGACGGCTTATAT 
AGCGGCGTCTTGGGTGACCATACCGCATTATCATGGAACACAATGGTTATTTCCAGCAAT 
TGTAGCCGTTGCTGGTGGAGCGTTGACCGTACTCTTTTTCTATCTCGGAGTTGAGACCAT 
CGGTCATTGGTTTAAGAAGATGAATCGTGTAGGGGATAATATACCTTCCTTTGCAAGAAC 
CAGTTCAGATTTAGCCGTCTCCGGAAAATCAGGCTATTTCACCTATTAAGAAAAACTGGT
TTTCTAATTTCCCTGTAACCTGTGTAATTGTGTATGTG 

>G2032 Amino Acid Sequence (domain in AA coordinates: entire protein) 

MQPIFHAILKNDLPAFLELVEDSESSIjEERNE 

PSLVSSRNAYRNTPLmoAAILGDW 

ARLIAEKTQSIGLGELIl^ISSGSTSIVGTIL^ 

KGDFELTTILLGLDQGLEEALNPNGLSPLHLAVLRGSVVILEEFLDKVPLSFSSITPSKE
TVFHLAARNKNMDAFVFMAESLGXNSQIL^^ 

NTVDITSKNKMGFEAFQLLPREAQDFELLSRWLRFGTETSQELDSENNVEQHEGSQEVEV
IRLLRIIGINTSEIAERKRSKEQEVERGRQNLEYQ^IEALQN^^ 

AGGINPPGGVYQDGPWRGKSLVGKTTAFKVFAICNNIALFTSLGIVILLVSIIPYKRKPL 
KRLLVATHRMMWVSVGFMATAYIAASWVTIPHYHGTQWLFPAIVAVAGGALTVLFFYLGV
ETIGHWFKKMNRVGDNIPSFARTSSDLAVSGKSGYFTY*
>G1396 (83..313)

TCGACCTCGTTTCCTTTCCTCCTCTCTTCCTACCATTAGTACGTTACTGGAGCTGATCTC 
ACGTATATTTTGGATCGTAATCATGGACGGCGAAGATTTTGCCGGAAAGGCGGCTGCTGA 
AGCCAAGGGATTGAACCCGGGATTAATCGTGCTGCTTGTTGTTGGAGGTCCGCTTCTTGT 
GTTCCTAATCGCCAACTACGTGCTTTACGTTTATGCTCAGAAGAACCTACCTCCAAGGAA 
GAAGAAGCCCGTTTCCAAAAAGAAGCTCAAGCGGGAGAAGCTAAAGCAAGGAGTCCCTGT 
CCCTGGAGAATAAAAGCCAGCTTAAGCTTCCTTCACTTGTGCCTCCTTCAAAGCGGTTTT 
TGTTCGGTTACCAAATTTCACCCTTGCGGGTTTTTTTCTTCCTTTACTTCTGTCATGAGG 
ATTATCTTTGAGGCCT 

>G1396 Amino Acid Sequence (domain in AA coordinates: TBD) 

I^GEDFAGKAAAEAJKGLNPGLIVIjLVVGGPLLV^ 

KLKREKLKQGVPVPGE* 

>G619 (382..2748)

ATTTTTTTCCAATCTGCAAATTTTAGTCTATGTCTGTTCCTTGTGCTCCCTCTTCTCAGT 
ACCTGCAAATGGAGGAAGAAGAATCCTTCTCTGAAACCCTTGTTCTCATTTGATTCCTCC
TTCCTCTCTCTTCTTCTCTCTCTGTCTCTGATTCGTTATTCCACACTTATGACTCATCTT 
TCCCGTCAATAGCTAAGTTTGCCTCTTCTTTGTGAAATTTAGCTGAAAAAGGAGAGGAAT 
TCCGAATTCTGTCACTTCAAAGCTCGAATTTTGCAAACTTTCCTTTGATGGGTTTTACTT 
GTTTTGTTGTAATCTGATTAAAAATAGAAACTTTTTGTTTTCTTCTTGTCTCCTTTTGCT 
CTTAAAAGAGAAGCTTTTTCAATGGAATTTGACTTGAATACTGAGATTGCGGAGGTGGAA 
GAGGAGGAGAATGATGATGTAGGAGTAGGAGTAGGAGGAGGAACAAGAATTGACAAGGGT 
AGGCTTGGAATTTCACCATCTTCTTCTTCTTCATGCTCTTCCGGATCATCATCGTCATCA 
TCTTCTACAGGCTCTGCATCTTCCATTTACTCTGAGCTTTGGCATGCTTGTGCTGGTCCT 
CTCACTTGTCTTCCCAAGAAAGGCAATGTAGTTGTCTATTTCCCTCAAGGTCATTTGGAG 
CAAGATGCTATGGTTTCATATTCGTCTCCTCTTGAAATCCCCAAATTTGACCTTAATCCC 
CAAATCGTCTGCAGGGTGGTTAATGTCCAGTTGCTTGCTAATAAGGACACCGATGAGGTC 
TACACTCAAGTCACTCTGCTTCCACTTCAAGAGTTTTCGATGCTAAATGGGGAGGGGAAA 
GAGGTCAAGGAGTTAGGAGGGGAGGAAGAGAGGAACGGAAGCTCATCCGTCAAGCGGACA 
CCTCATATGTTCTGTAAAACCTTAACAGCGTCTGACACAAGCACACATGGAGGCTTCTCT 
GTACCTAGAAGAGCCGCTGAAGATTGTTTTGCTCCTCTTGACTACAAACAACAGAGGCCA 
TCTCAAGAGCTC^TTGCAAAGGACCTCCATGGAGTAGAGTGGAAGTTTCGCCATATCTAT 
AGAGGTCAACCAAGGAGGCATCTACTCACCACTGGTTGGAGTATCTTTGTCAGTCAAAAG 
AATCTCGTCTCTGGTGATGCGGTTCTCTTTCTGAGAGACGAAGGAGGAGAGCTGAGATTA 
GGAATCAGAAGAGCAGCACGGCCAAGAAATGGACTTCCTGACTCAATCATTGAGAAGAAT 
TCATGTTCAAACATTCTGTCTCTTGTGGCTAATGCTGTATCTACAAAAAGCATGTTTCAT 
GTGTTCTACAGTCCACGAGCGACGCATGCAGAGTTTGTGATTCCTTATGAGAAGTATATC 
ACAAGCATCAGGAGTCCTGTTTGCATAGGCACAAGATTTAGAATGCGATTTGAAATGGAC 
GATTCTCCTGAGAGAAGAXGCGCTGGTGTAGTGACTGGAGTCTGTGACTTGGACCCGTAT 
AGGTGGCCAAACTCTAAATGGAGGTGCTTGTTGGTGCGATGGGATGAGTCTTTTGTGAGT 
GATCATCAAGAAAGAGTTTCACCTTGGGAGATTGATGCCTCGGTTTCTCTCCCACACTTG 
AGCATTCAGTCATCTCCAAGGCCTAAAAGGCCATGGGCAGGTTTACTGGATACTACCCCA 
CCCGGAAACCCCATAACAAAiU^GGGGTGGTTTTTTGGACTTTGAGGAGTCGGTTAGACCC 
TCTAAGGTCTTGCAAGGTCAAGAAAATATAGGTTCTGCATCACCCTCACAGGGGTTTGAT 
GTTATGAACCGCCGGATACTGGATTTTGCGATGCAGTCTCATGCAAATCCAGTCCTTGTG 
TCGAGTAGAGTCAAGGATCGATTTGGTGAGTTTGTAGATGCTACTGGCGTGAACCCAGCT 
TGTTCAGGTGTTATGGACCTGGATAGGTTTCCAAGGGTCTTGCAAGGTCAAGAAATTTGC 
TCGCTTAAATCATTCCCGCAATTTGCTGGTTTCAGTCCAGCTGCTGCTCCTAATCCCTTT 
GCTTACCAAGCCAACAAGTCAAGTTACTATCCGCTAGCTTTGCATGGGATTAGGAGCACT 
CATGTTCCGTATCAGAATCCATACAATGCGGGAAACCAATCCTCGGGTCCCCCTTCACGT 
GCAATAAACTTTGGTGAAGAGACTAGAAAGTTTGATGCACAAAATGAAGGTGGCCTACCA 
AATAATGTTACAGCTGATTTGCCATTCAAGATTGATATGATGGGAAAACAGAAAGGCAGT 
GAGTTGAATATGAATGCTTCATCAGGATGTAAACTTTTCGGATTCTCCTTACCAGTGGAG 
ACACCTGCATCTAAGCCGCAAAGCTCGAGCAAAAGAATCTGTACAAAGGTTCACAAGCAA 
GGAAGCCAAGTGGGGAGAGCTATTGATTTGTCGCGACTTAACGGGTATGATGATCTCCTT 
ATGGAGCTTGAACGGCTGTTCAACATGGAAGGGCTTCTCAGGGATCCTGAAAAAGGATGG 
AGGATCTTATATACTGATAGTGAGAACGATATGATGGTCGTTGGCGATGATCCATGGCAT 
GATTTCTGCAATGTGGTGTGGAAGATACACTTATACACGAAAGAGGAAGTGGAGAATGCG 
AATGACGATAACAAGAGTTGTTTAGAGCAAGCTGCTCTCATGATGGAAGCATCAAAGTCA 
TCTTCTGTGAGCCAGCCTGATTCTTCTCCTACAATCACTAGGGTTTGATACCCATAAAGA 
AGCTTATTTCCTATGTTTTAAAGTGTGTTTTGCTCACAAAAGAACTTCAACTTTATCTTT 
GTCTTTGAATCCATTTATGTGTTTGTTTGTGTTTCTTCTGGTCTCCATGGATGTCTCATG 
TGTACCGTTTTACTeGAGAGATATGTGAGTTTATGGGATGTGTAAAGCATGCCATTGGAT 
TTTAAGGTTTTCAAAATTAGAATATATATATTAGTTTTGAAGTTAAAAAAAAAAAAAAAA 
A 

>G619 Amino Acid Sequence (domain in AA coordinates: 64-406) 
MEFDLNTEIAEVEEEENDDVGVGVGGGTRIDKGRLGISPSSSSSCSSGSSSSSSSTGSAS
SIYSELWHACAGPLTCLPKKGNVVVYFPQGHLEQDAMVSYSSPLEIPKFDLNPQIVCRVV
NVQIjLANKDTDEVYTQVTLLPLQEFSMLNGEG 

IiTASDTSTHGGFSVPRRAAEDCFAPLDYKQQRPSQELIAKDLHGVEWKFRHIYRGQPRRH 
LLTTGWSIFVSQKNLVSGDAVLFLRDEGGELRLGIRRAARPRNGLPDSIIEKNSCSNILS
LVANAVSTKSMFHVFYSPRATHAEFVIPYEKYITSIRSPVCIGTRFRMRFEMDDSPERRC
AGWTGVCDLDPYRWPNSKWRCLLVRWDESFVSDHQERVSPWEIDPSVSLPHLSIQSSPR
PKRPWAGLLDTTPPGNPITKRGGFLDFEESVRPSKVLQGQENIGSASPSQGFDVMNRRIL
DFAMQSHANPVLVSSRVKDRFGEFVDATGVNPACSGVMDLDRFPRVLQGQEICSLKSFPQ
FAGFSPAAAPNPFAYQANKSSYYPLALHGIRSTHVPYQNPYNAGNQSSGPPSRAINFGEE 

TRKFDAQNEGGLPNNVTADLPFKIDMMGKQKGSELNMNASSGCKLFGFSLPVETPASKPQ
SSSKRICTKVHKQGSQVGRAIDLSRLNGYDDLLMELERLFNMEGLLRDPEKGWRILYTDS
ENDMMVVGDDPWHDFCNVVWKIHLYTKEEVENANDDNKSCLEQAALMMEASKSSSVSQPD
SSPTITRV*

>G2295 (33..917)

GTAATATATAACAATAACTCAGGTTACAAAGGATGGTTCCGAAAGTGGTCGACCTACAAA 
GGATAGCGAACGATAAGACAAGGATAACAACTTACAAGAAGAGGAAAGCTAGTCTTTACA 
AGAAGGCACAAGAGTTCTCAACTCTCTGCGGCGTCGAGACATGTCTCATCGTCTACGGTC 
CCACGAAGGCTACCGATGTGGTGATTTCCGAGCCAGAGATATGGCCGAAGGACGAGACCA 
AAGTCAGGGCCATCATACGCAAGTACAAAGACACAGTGTCGACCAGCTGCAGGAAAGAAA 
CCAACGTGGAGACTTTCGTCAACGATGTAGGGAAAGGAAACGAGGTGGTGACTAAAAAGA 
GAGTGAAGCGTGAGAATAAGTATTCTAGTTGGGAGGAGAAGCTAGACAAGTGTTCACGAG 
AGCAACTACATGGGATTTTCTGTGCCGTGGATAGCAAGTTAAATGAAGCTGTAACGAGAC 
AGGAGCGTAGTATGTTTAGGGTTAATCATCAAGCCATGGACACACCATTCCCGCAGAATT 
TAATGGACCAACAATTCATGCCACAGTATTTTCATGAGCAGCCACAGTTTCAAGGCTTCC
CTAATAATTTCAATAATATGGGTTTCTCGTTGATTTCACCTCATGATGGTCAGATTCAAA 
TGGACCCAAATCTCATGGAGAAGTGGACCGACTTGGCTTTGACTCAAAGCTTGATGATGT 
CAAAGGGAAACGATGGTACTCAATTCATGCAGAGGCAAGAACAACCATACTATAATCGTG 
AACAGGTTGTATCGAGGTCTGCAGGTTTCAATGTTAACCCGTTTATGGGATATCAAGTCC 
CGTTTAATATTCCTAATTGGAGATTATCGGGAAATCAAGTTGAAAATT(3GGAGCTTTCAG 
GGAAGAAAACGATATGATTTGAATTACGGAGCTTTATTAGTTTTTAGGGTTTTATAGTTT 
TG 

>G2295 Amino Acid Sequence (domain in AA coordinates: TBD)
MVPKVVDLQRIANDKTRITTYKKRKAS^ 

PEIWPKDETKVRAIIRKYKDTVSTSCRKETNVETFVNDVGKGNEVVTKKRVKRENKYSSW
EEKLDKCSREQLHGIFCAVDSKLNEAVTRQERSM 
HEQPQFQGFPlTOFNNMGFSIiISPHDGQIQlvro^ 

RQEQPYYNREQWSRSAGFimTPFMGYQVPFNIPNWRLSGNQVENWELSG * 
>G312 (1..1755)

ATGGCTTACATGTGCACTGATAGTGGCAATCTAATGGCTATTGCTCAACAAGTCATCAAA 
CAGAAGCAGCAACAAGAACAACAACAGCAGCAACATCATCAAGACCATCAGATTTTTGGT 
ATTAATCCTTTGTCTCTTAACCCATGGCCCAATACTTCCCTCGGGTTTGGGCTTTCAGGT 
TCGGCTTTTCCCGACCCGTTTCAAGTTACCGGCGGCGGAGATTCCAACGATCCTGGCTTT 
CCTTTTCCTAACTTAGACCACCACCACGCCACAACCACCGGCGGTGGGTTCAGGTTATCT 
GATTTCGGCGGTGGAACCGGCGGCGGCGAGTTTGAGTCCGACGAGTGGATGGAGACTCTT 
ATCAGCGGTGGAGACTCCGTTGCAGACGGTCCTGATTGTGACACCTGGCATGATAATCCC 
GATTACGTAATCTACGGTCCTGATCCATTCGATACTTACCCGAGTCGACTCAGTGTCCAA 
CCGTCAGATCTAAACCGAGTCATTGACACGTCGAGTCCGCTTCCTCCGCCGACCTTGTGG 
CCTCCTTCTTCGCCATTATCGATTCCTCCGCTTACTCATGAGTCACCAACCAAAGAAGAT 
CCAGAGACTAACGACTCCGAAGACGATGACTTCGACCTAGAACCACCTCTCCTCAAAGCT 
ATATACGACTGTGCACGGATCTCAGACTCTGAGCCTAACGAAGCTTCCAAGACGCTTCTT 
CAGATCCGAGAATCTGTATCGGAGCTAGGTGATCCGACGGAGCGAGTTGCATTTTACTTC 
ACGGAAGCTCTCTCCAACAGACTGTCTCCTAATTCGCCGGCGACGTCGTCTTCTTCTTCA 
TCTACGGAGGATTTAATCTTATCTTATAAAACCCTAAACGACGCTTGTCCTTACTCCAAA 
TTCGCACATTTGACGGCGAATCAAGCGATTCTAGAAGCGACGGAGAAGTCGAACAAGATT 
CACATCGTCGATTTTGGAATCGTTCAAGGTATACAATGGCCTGCTCTTCTTCAAGCTCTA 
GCTACTCGTACTTCTGGTAAACCCACTCAAATCCGGGTCTCGGGTATACCCGCTCCATCT 
CTCGGTGAATCTCCGGAACCGTCGTTAATCGCCACCGGAAACCGCCTCCGTGATTTCGCC 
AAGGTTCTGGATCTGAATTTCGATTTCATCCCAATTCTCACTCCCATACATTTACTTAAC 
GGGTCAAGTTTCCGGGTCGACCCGGATGAAGTACTGGCCGTGAATTTCATGCTCCAGCTC 
TACAAATTACTCGACGAGACGCCGACGATAGTTGACACCGCACTACGGCTCGCCAAATCG 
TTGAACCCGAGGGTCGTCACTCTCGGAGAATACGAAGTGAGCTTAAACCGGGTCGGTTTC 
GCTAACCGGGTAAAGAACGCGCTTCAATTCTATTCCGCGGTTTTCGAATCCCTTGAACCG 
AACTTGGGGCGTGATTCGGAGGAGAGAGTGAGAGTTGAGCGAGAGTTGTTCGGCCGGAGA
ATCTCGGGTTTGATTGGACCGGAGAAAACCGGAATTCATAGAGAAAGAATGGAAGAGAAA 
GAGCAATGGCGGGTATTAATGGAGAATGCCGGTTTTGAATCGGTTAAGCTGAGTAATTAC
GCAGTGAGCCAAGCGAAGATTCTATTGTGGAATTACAATTACAGCAATTTGTATTCAATT 
GTTGAATCTAAGCCTGGCTTCATCTCTTTGGCCTGGAACGATTTACCTCTCCTCACTCTT 
TCTTCCTGGCGATAA 

>G312 Amino Acid Sequence (domain in AA coordinates: 320-336) 

MAYMCTDSGNLMAIAQQVIKQKQQQEQQQQQHHQDHQIFGINPLSLNPWPNTSLGFGLSG

SAFPDPFQVTGGGDSNDPGFPFPNLDHHHATTTGGGFRLSDFGGGTGGGEFESDEWMETL 

ISGGDSVADGPDCDTWHDNPDYVIYGPDPFDTYPSRLSVQPSDLNRVIDTSSPLPPPTLW 

PPSSPLSIPPLTHESPTKEDPETNDSEDDDFDLEPPLLKAIYDCARI 

QIRESVSELGDPTERVAFYFTEALSNRLSPNSPATSSSSSSTEDLILSYKTLNDACPYSK

FAHLTANQAILEATEKSNKIHIVDFGIVQGIQWPALLQALATRTSGKPTQIRVSGIPAPS 

LGESPEPSLIATGNRLRDFAKVLDLN^ 

YKXiLDETPTIVDTALRLAKSIiNPRVVT 

NLGRDSEERVRVERELFGRRISGLIGPEKTGIHRERMEEKEQWRVLMENAGFESVKLS 
AVSQAKILLWNYNYSNLYSIVESKPGFISLAWNDLPLLTLSSWR*
>G1444 (192..1001)

AATCCCCTATCCTTCGCAAGACCCTTCCTCTATATAAGGAAGTTCATTTCATTTTTTTTT 
GACACGCTGACAAGCTGACTCTAGCATATCTGGCACCGGCGACCAGTCCTTCTTTGGTGC 
AAAGATCCCAAAAAATCAAAATCGAAAGAGAGAATAAATCAAAAGGAAGAATCTTTATCT 
GCTTTCTCTCGATGAGGATCCGGAAACGACAAGTGCCTCTTCCTTTATCGTCTCTATTAC 
CAGTTCCTCTATCAGATCTCTACTTTAACCGCTCACCGACGGCCACCGCGAGATACTTTC 
GCGGTGGTTATAAAGACGGCGGTGATGATTTTGGTTCTCTTCAGCTTTCGCTTCCGCCGC 
CGTCGCAGATTTCTGATCGGCTTATTCAAAGAGATTTGATAAAGAAGAAGGAGGAGGTCA 
AGGCTTTGGATGATGATAATGGTGATGTAGACGTCAAGAGTCGTACTGATGCATCGGGCA 
GCAAGAATGTTAATCCCCGAGGAGAATCCGTCTCTTCAATACAAGTTGTCGAGAAGAATG 
AAAAGGTTGTGTCTTTGAGGAAGAGAAGAGGCTTTATCAACTTTGAGGATTACGAAGATG 
AGGAAGATGAAGAAGGTAGTGGCGGTGGAGGCCGTATTAATAAAGGGAAAAAGAAAGCGA 
AAAAGAGCGGTGGTGGGTTAGAGGAAGGATCACGGTGCAGCCGTGTTAACGGTAGAGGAT 
GGAGATGTTGTCAGCAAACGCTTGTTGGTTATTCTCTTTGTGAGCATCATCTCGGTAAAG 
GAAGGGTAAGGAGCATGAACAAGAGTGGTGGTGGTCGTGGCGGCGAGAAAAAGGCGGTGG 
TGGTGGAAGTGAAGAAGAAGAGAGTAAAGCTTGGCATGGTAAAGGCACGTTCAATAAGTA 
GTTTGCTTGGACAAACCAGCACTAGTGGTGGTACTAGTGGTGATGTTGATCAGGGTGAGA 
TAAGTGCACCTGCTGATCAGTTCGCTGCATGTGATAAGTAGGTCTGTTGATCAGCATTTG 
CATGTATATGGATATGTGTATGTTTATGTACATGATGATAATGGGCATAGCGCGGCCGCT 
CTAGACAGGCCTGGAACCGGATCCTCTAGCTAGAGCTTTCGTTAGTATCATCGGGTTTAG 
ACAACGTT 

>G1444 Amino Acid Sequence (domain in AA coordinates: 168-193) 

MRIRKRQVPLPLSSLLPVPLSDLYFKRSPTATARYFRGGYKDGGDDFGSLQLSLPPPSQI 

SDRLIQRDLIKKKEEVKALDDDNGDVDVKSRT 

SLRKRRGFINFEDYEDEEDEEASGGGGRINKGKKKAECKSGGGLEEGSRCSRVNGRGWRCC 
QQTLVGYSLCEHHLGKGRWSMNKSGGGRGGEKKAVVVEVKKKRVKL 
QTSTSGGTSGDVDQGEISAPADQFAACDK* 
>G801 (27..746)

GATAGTGATAACGAAATCCTAATTCCATGGCCGACAACGACGGAGCAGTGAGTAACGGCA 
TCATAGTCGAGCAGACGTCAAACAAAGGACCTCTTAACGCCGTTAAGAAACCACCGTCTA 
AAGATCGACACAGGAAAGTTGACGGAAGAGGAAGAAGGATTCGTATGCCAATCATTTGCG 
CAGCTCGAGTTTTTCAATTGACCAGAGAGTTAGGTCACAAGTCCGATGGTCAAACCATAG 
AGTGGCTTCTCCGTCAAGCTGAGCCTTCTATCATAGCCGCCACTGGAACTGGCACTACTC 
CGGCGAGTTTCTCCACTGCTTCTCTCTCCACTTCTTCTCCGTTTACTCTCGGGAAACGTG 
TCGTCAGAGCGGAGGAAGGAGAATCCGGCGGCGGAGGAGGAGGAGGGTTAACAGTGGGAC 
ACACAATGGGGACTTCGTTAATGGGTGGTGGTGGTTCTGGTGGGTTTTGGGCTGTTCCGG 
CGAGGCCGGATTTCGGACAAGTCTGGAGCTTTGCAACCGGAGCTCCACCGGAAATGGTTT 
TTGCGCAGCAGCAGCAACCAGCTACACTCTTCGTCCGCCACCAGCAGCAACAGCAAGCTT 
CCGCCGCCGCAGCAGCTGCAATGGGTGAGGCTTCAGCAGCTAGAGTTGGGAATTATCTTC 
CGGGTCATCATCTCAATTTGCTTGCTTCTTTGTCTGGTGGAGCTAACGGGTCGGGTCGGA 
GGGAAGACGACCACGAACCACGTTGAGAAATGGTATTGTCTTTTTGGTAATGTATAGAAA

AATTCCTATGTTTTATGTCATCGAAAGTGTTTAGAAAGTACCTCTAATTTGCGGTTTCTT 

TTGCTCCTTTTTTACTTAATTTAAGCTTATTGCTTGTTTGATTAGGGTTTTAGGGTTTAA 

GAATATTTGGTCTCGTTAATTTGTTTCGGAGAGTGATAGAAAGAGAGAGAGATTGATTGA 

TTGTTGTACCTAAAACGCTATAAAAGCTCTGTTTTTACTAGCGAAAAAA 

>G801 Amino Acid Sequence (domain in AA coordinates: 32-93) 

MADNDGAVSNGIIVEQTSNKGPLNAVKKPPSKDRHSKVDGRGRRIRMPIICAARVFQLTR

ELGHKSDGQTIEWLLRQAEPSIIAATGTGTTPASFSTASLSTSSPFTLGKRVVRAEEGES

GGGGGGGLTVGHTMGTSLMGGGGSGGFWAVPARPDFGQVWSFATGAPPEMVFAQQQQPAT 

LFVRHQQQQQASAAAAAAMGEASAARVGIT^PGHHIjNIjLASIiSGGANGSGRRE 

>G1950 (42..764)

CTGAATTCGAACTTTGGAAGAAGAAGAAGCTTTGATCAATCATGGAAATTGCAACCGATA 
CAGCAAAGCAGATGAGAGACGAAGAGTTGTTCAAAGCAGCGGAATGGGGAGATTCATCGT 
TGTTCATGTCATTATCTGAAGAACAGCTCTCTAAATCTCTCAATTTCAGAAACGAAGATG 
GTCGCTCTCTCCTCCATGTCGCTGCTTCCTTCGGCCATTCTCAAATAGTGAAGTTGTTAT 
CAAGTTCAGATGAAGGAAAGACTGTAATCAATAGCAAGGATGATGAAGGATGGGCTCCTT 
TGCATTCCGCTGCTAGCATCGGTAATGCTGAGCTCGTTGAGGTGCTTTTGACCAGAGGTG 
CTGATGTCAATGCCAAAAATAACGGTGGTCGCACTGCTCTTCACTATGCTGCTAGCAAAG 
GCCGGTTGGAGATTGCTCAGCTTTTATTAACACACGGTGCAAAGATTAACATCACAGACA 
AGGTTGGTTGCACTCCGCTTCAGAGGGCAGCAAGCGTGGGAAAGTTAGAAGTTTGTGAAT 
TTCTTATTGAAGAAGGAGCAGAGATCGATGCTACGGATAAAATGGGTCAAACTGCACTCA 
TGCATTCAGTTATCTGCGATGACAAACAGGTTGCGTTCCTGCTTATAAGACATGGTGCAG 
ATGTGGATGTAGAAGACAAGGAAGGCTACACTGTTCTAGGCCGAGCTACCAATGAATTCC 
GACCTGCACTTATCGATGCTGCTAAGGCCATGCTTGAAGGATAAAATGACTCTGGATTAC 
TTTAAAACTTACTAACTCTGAGAGTTGTTTAGTTACTTAAAAGGATTXTTCTTTACTGTA 
TCATGTTTGCAAAATGTTTCTGCCTTATCAATTCATGTTCTGT 

>G1950 Amino Acid Sequence (domain in AA coordinates: 65-228) 
MEXATDTAKQMRDEELFKAAEWGDSSLFMSLSE 

QIVKLLSSSDEAKTVTNSKDDEGWAPLHSAASIGNAELVEVLLTRGADVNAKNNGGRTAL
HYAASKGRLEIAQLLLTHGAKINITDKVGCTPLHRAASVGKLEVCEFLIEEGAEIDATDK
MGQTALMHSVICDDKQVAFLLIRHGADVDVEDKSGYTVLGRATNEFRPALIDAAKAMLEG
* 

>G958 (55..1950)

CGTCGACATGTTCATATTTGTTTCTAGCTAAGAAGTTTGTATAAGGCAGTGGACATGGCT 
CCTGTTTCAATGCCTCCAGGTTTCCGGTTTCATCCAACAGACGAAGAGCTTGTCATATAC 
TACCTCAAGCGAAAGATTAATGGTCGGACTATTGAGTTAGAGATAATACCCGAGATTGAT 
CTTTACAAATGCGAACCTTGGGATTTACCTGGGAAGTCCTTGCTGCCAAGTAAAGACCTA 
GAATGGTTCTTTTTCAGTCCTCGAGACCGGAAATATCCAAACGGATCAAGAACAAACCGG 
GCGACCAAAGCAGGTTACTGGAAAGCCACCGGGAAAGATCGTAAAGTGACTTCACATTCA 
CGGATGGTTGGAACAAAGAAAACATTAGTTTATTACCGAGGAAGAGCGCCTCATGGCTCT 
CGTACCGATTGGGTCATGCACGAGTACCGTCTTGAAGAACAAGAATGTGACTCTAAATCC 
GGTATACAGGATGCCTATGCACTTTGTCGAGTATTTAAGAAGAGTGCTTTAGCCAACAAA 
ATTGAAGAACAACACCATGGTACGAAGAAGAACAAAGGAACGACTAATAGTGAACAATCT 
ACTTCTAGTACTTGTTTGTATTCTGATGGAATGTATGAAAACCTCGAAAACTCGGGGTAT 
CCAGTCTCACCTGAGACAGGAGGCTTAACTCAACTCGGTAATAATTCGTCGTCGGATATG 
GAAACGATAGAGAATAAATGGAGTCAGTTTATGTCGCATGACACGTCCTTCAACTTCCCA 
CCTCAGTCTCAATATGGAACAATCTCATATCCTCCCTCGAAGGTTGATATAGCGTTAGAG 
TGTGCAAGACTACAAAATCGTATGTTGCGACCAGTACCACCACTTTACGTAGAAGGTCTC 
ACACACAATGAATATTTTGGAAACAATGTAGCTAACGATACAGATGAAATGTTGAGCAAG 
ATTATAGCATTGGCTCAAGCCTCACATGAGCCACGAAACAGTCTAGACTCATGGGACGGT 
GGTTCTGCTTCCGGGAACTTCCATGGAGACTTTAACTATTCCGGAGAAAAAGTCTCATGC 
CTAGAGGCGAACGTGGAGGCTGTAGATATGCAAGAACACCATGTGAATTTTAAGGAAGAA 
AGACTTGTTGAAAACTTGAGATGGGTAGGAGTATCAAGCAAGGAACTTGAAAAGAGCTTC 
GTTGAAGAACACTCAACGGTAATTCCTATAGAAGATATTTGGAGATATCATAATGATAAT 
CAAGAACAAGAACATCATGATCAAGATGGTATGGACGTTAACAACAACAATGGAGATGTG 
GATGATGCTTTCACACTCGAGTTTTCGGAAAACGAACATAACGAG7VATCTTTTGGACAAG 
AACGATCATGAGACAACGAGTTCCTCATGTTTTGAGGTGGTAAAAAAAGTTGAGGTTAGC 
CATGGATTGTTTGTCACAACTCGTCAGGTAACCAACACATTCTTCCAACAGATAGTACCA
TCGCAAACCGTTATAGTTTATATAAATCCGACGGATGGCAATGAGTGTTGTCATAGTATG 
ACATCAAAAGAGGAGGTTCATGTCCGTAAAAAGATAAATCCGCGAATCAACGGAGTAAGC 
TCAACAGTTCTTGGACAATGGAGAAAATTCGCGCATGTTATTGGCTTCATTCCTATGCTT 
CTATTGATGCGTTGTGTTCATCGAGGTAACTCTAACAAAAACAGAGGCAGTGAAGGTTAC 
TCGAGGCAGCCTACGAGAGGAGATTGTAACAATCGGGGAACAATACTCATGATGGAAAAT 
GCTGTCGTGAGAAGAAAAATTTGGAAGAAGAAGAAAGAGAAAAATATGGTTGACGAACAA 
GGTTTTCGGTTTCAAGATAGTTTCGTATTGAAGAAGTTGGGGCTTTCTCTTGCTATCATC 
TTAGCTGTTTCTACCATAAGTCTTATTTGAATACTGAGGTTCAATATATCATATATGGCT 
TTTCACTTTTCTATTGTACTCCCATTTGCCTAGGTCGTATGC 

>G958 Amino Acid Sequence (conserved domain in AA coordinates: 7-156)

MAPVSMPPGFRFHPTDEELVIYYLKRKINGRTIEDEIIPEIDLYKCEPWDLPGKSLLPSK 

DLEWFFFSPRDRKYPNGSRTNRATKAGYWKATGKm 

GSRTDWVMHEYRLEEQECDSKSGIQDAYALCRVFKKSALANKIEEQHHGTKKNKGTTNSE

QSTSSTCLYSDGMYENLENSGYPVSPETGGLTQLGNNSSSDMETIENKWSQFMSHDTSFN 

FPPQSQYGTISYPPSKVXJIALECARLQNRMLPPVPPLYV^GIjT^ 

SKIIAIiAQASHEPRNSLDSWDGGSASGNFHGDFNYSGEKVSCLI^^ 

EERLVENLROTGVSSKELEKSFVEEHSTVIPIEDIWRYHNDNQEQEI^ 

D VDDAFTLEFS ENEHNENLLDKNDHETTS S S CFE WKKVEVSHGLFVTTRQVTWTFFQQI 

VTSQTVIvYINPTDGNECCHSMTSKEEVHVRKKINPRINGVSSTVXGQWRKFAHVIGF 

MLLLMRCVHRGNSNKNRGSEGYSRQPTRGD^ 

EQGFRFQDSFVLKKLGLSLAIILAVSTISLI*

>G1037 (1..1722) 

ATGACTGTTGAACAAAATTTAGAAGCTTTGGATCAGTTTCCTGTAGGAATGAGAGTTCTT 

GCTGTTGATGATGACCAAACTTGTCTCAAAATCCTTGAATCTCTCCTTCGTCACTGCCAA 

TACCATGTAACAACGACGAACCAAGCACAAAAGGCTTTAGAGTTATTGAGAGAGAACAAG 

AACAAGTTTGATCTGGTTATTAGTGATGTTGACATGCCTGACATGGATGGTTTCAAACTC

CTTGAGCTTGTTGGTCTTGAAATGGACCTACCTGTCATAATGTTGTCTGCGCATAGTGAT 

CCAAAGTATGTGATGAAGGGAGTTACTCATGGTGCTTGTGATTATCTACTGAAGCCGGTT 

CGTATTGAGGAGTTGAAGAACATATGGCAACATGTCGTGAGAAGTAGATTTGATAAGAAC 

CGTGGGAGTAATAATAATGGTGATAAGAGAGATGGATCAGGTAATGAAGGTGTTGGGAAT 

TCTGATCCGAACAATGGGAAAGGTAATAGAAAACGTAAAGATCAGTATAATGAAGATGAG 

GATGAGGATAGAGATGATAATGATGATTCGTGTGCTCAAAAGAAGCAACGTGTTGTTTGG 

ACTGTTGAGCTGCATAAGAAATTTGTTGCAGCTGTTAACCAATTGGGATATGAGAAGGC^ 

ATGCCTAAAAAGATTTTGGATCTGATGAATGTTGAGAAGCTCACTAGAGAAAATGTGGCC 

AGTCATCTTCAGAAATTCCGCCTTTACTTGAAGAGGATCAGTGGTGTGGCTAATCAGCAA 

GCTATTATGGCAAACTCTGAGTTACATTTTATGCAAATGAATGGACTTGATGGTTTCCAT 

CACCGCCCAATCCCTGTTGGATCTGGTCAGTACCATGGTGGGGCTCCTGCAATGAGATCT 

TTCCCTCCAAACGGGATTCTTGGCAGACTCAATAGCTCTTCGGGGATCGGTGTCCGCAGC 

CTTTCTTCTCCTCCTGCAGGAATGTTCTTGCAAAACCAGACCGATATCGGAAAGTTTCAC 

CATGTCTCATCACTTCCTCTTAACCACAGTGATGGAGGAAACATACTTCAAGGGTTGCCA 

ATGCCTTTAGAGTTCGACCAGCTTCAGACAAACAACAACAAAAGTAGAAACATGAACAGT 

AACAAGAGCATTGCTGGGACCTCCATGGCTTTTCCTAGCTTCTCTACGCAACAAAACTCG 

CTCATCAGTGCTCCTAATAACAATGTCGTGGTTCTAGAAGGTCACCCACAAGCAACTCCT

CCAGGCTTCCCAGGACACCAGATCAATAAACGTTTGGAGCATTGGTCAAATGCTGTATCC 

TCTTCGACTCACCCTCCTCCCCCGGCACATAACAGTAATAGTATCAATCATCAGTTCGAT 

GTCTCTCCATTACCGCATTCTAGACCCGACCCCTTGGAATGGAACAATGTGTCATCAAGC

TACTCTATACCATTCTGTGACTCTGCCAATACATTGAGTTCTCCAGCCTTGGATACAACA 

AATCCCCGAGCTTTCTGTAGAAACACGGACTTCGATTCAAACACAAATGTGCAACCTGGA 

GTCTTTTATGGTCCATCCACGGATGCTATGGCTCTGTTGAGTAGTAGTAACCCGAAAGAA 

GGGTTCGTCGTAGGCCAACAGAAGTTACAGAGTGGTGGATTCATGGTTGCAGATGCTGGT 

TCCTTAGATGATATAGTCAACTCCACGATGAAGCAGGTGTGA 

>G1037 Amino Acid Sequence (domain in AA coordinates: 11-134, 200-248) 
MTVISQNLEALDQFPVGMRVLAVDDDQTCLK^ 

l^FDLVISDVXJMPDMDGFKLLELVGLEI^LPVIMLSAHSDPKYVT^KG 

RIEELKWIWQHVVRSRFDIOSrRGSNNNGDKRDGSGNEGVGNSDPlTM 

DEDRDDNDDSCAQKKQRVWTV^LHK^ 
SHLQKFRLYLKRISGVANQQAIMANSELHFMQMNGLDGFHHRPIPVGSGQYHGGAPAMRS

FPPNGILGRLNSSSGIGVRSLSSPPAGMFLQNQTDIGKFHHVSSLPLNHSDGGNILQGLP

MPLEFDQLQTNNNKSRNMNSNKSIAGTSMAFPSFSTQQNSLISAPNKNVVVLEGHPQATP

PGFPGHQINKRLEHWSNAVSSSTHPPPPAHNSNSINHQFDVSPLPHSRPDPLEWNNVSSS 

YSIPFCDSANTLSSPALDTTNPRAFCRNTDFDSNTNVQPGVFYGPSTDAMALLSSSNPKE

GFVVGQQKLQSGGFMVADAGSLDDIVNSTMKQV*

>G2065 (33..1124)

AACCACACAAAACAAAACAAAAAAACATATTGATGGGGATGAAGAAGGTAAAGCTATCTT 
TGATAGCTAATGAAAGATCAAGGAAAACATCCTTCATGAAGAGGAAAAACGGGATATTCA 
AGAAACTCCACGAGTTGTCAACTCTATGTGGTGTCCAAGCTTGTGCTCTCATCTATAGTC 
CATTCATACCGGTTCCAGAGTCATGGCCGTCAAGGGAAGGTGCTAAAAAGGTAGCTTCAA 
AGTTTCTGGAGATGCCGCGGACAGCCCGAACCAGGAAGATGATGGATCAAGAAACCCATC 
TTATGGAGAGGATTACCAAAGCAAAAGAGCAACTAAAGAATTTGGCTGCTGAGAACCGAG 
AATTACAGGTTAGACGATTTATGTTTGATTGTGTTGAAGGCAAAATGTCCCAGTATCGTT 
ATGATGCAAAAGACCTTCAAGATTTGCTATCTTGTATGAATCTATATCTCGATCAGCTTA 
ACGGAAGGATCGAGTCCATTAAAGAAAACGGTGAGTCGTTGTTGTCTTCCGTCTCTCCTT 
TTCCTACTAGAATTGGTGTTGACGAAATTGGTGATGAGTCGTTTTCCGACTCTCCTATTC 
ATTCTACAACTAGGGTTGTAGATACTCCTAATGCTACCAATCCTCATGTTCTTGCGGGCG 
ATATGACTCCTTTTCTTGATGCGGACGCAAATGCGGTAACTGCTCCCAGTCGATTTTCTG 
ATCATATTCAATATGAAAATATGAATATGAGTCAAAATCTGCATGAACCGTTTCAACACC 
TTGTTCCTACTAACGTTTGTGATTTTTATCAAAATCAGAATATGAATCAGGTTCAATACC 
AGGCTCCTAATAATCTGTTTAATCAGATTCAACGAGAATTCXACAACATAAATTTGAATC 
TGAATTTGAATCTGAATTCAAATCAGTATCTGAATCAACAACAATCATTGATGAATCCGA 
TGGTGGAACAACATATGAATCATGTTGGAGGGCGTGAAAGCATTCCTTTCGTGGACAGAA 
ACTACTACAACTACAATCAACTACCAGCCGTTGATCTTGCTTCCACCAGTTACATGCCTT 
CAACCACCGATGTTTATGATCCTTACATCAACAACAATCTCTAATCACAAAAGACGGAGA 
TTTTCTAGTTTAA 

>G2065 Amino Acid Sequence (domain in AA coordinates: TBD) 

MGMKKVKLSLIANERSRKTSFMKRKNGIFKKLHELSTLCGVQACALIYSPFIPVPESWPS

REGAKKVASKFLEMPRTARTRKMMDQETHLMERITKAKEQLKNLAAENRELQVRRFMFDC

VEGKMSQYRYDAKDLQDLLSCMNLYLDQLNGRIESIKENGESLLSSVSPFPTRIGVDEIG 
DESFSDSPIHSTTRVVDTPNATNPHVLAGDMTPFLDADANAVTAPSRFSDHIQYENMNMS
QNLHEPFQHLVPTNVCDFYQNQNIWQVQYQAPl^ 
NQQQSFMNPMVEQHMNHVGGRESIPFVDRNYYNYNQLPAVDLASTSYM 

>G2137 (77..1123)

GGGATTTGACTTTAGCACTTCAAAATCCAAAGCTAAAAGACAAAAAAGAATAGAGGTTCG 
ATTTGCATCTCCATTAATGGGCATCGATCTTTCTCTTAAGCTCGAGGCCGAGGAGAAAAA 
GAAAGAGATAGAAGGATCGAAACATAGCCGTGAGAACAAAGAAGACGAAGAACATGATGC 
TAGTGGTGATGAAGATGAACAAATGGTGAAAGAAGACGAAGATGATTCTTCTTCTTTAGG 
TTTAAGAACCCGAGAAGAAGAAAACGAACGTGAAGAGCTCTTGCAGCTACAGATCCAGAT 
GGAAAGTGTGAAAGAAGAGAATACTAGGTTGAGGAAGCTTGTCGAGCAGACTCTTGAAGA 
TTATCGTCATCTTGAGATGAAATTCCCGGTTATCGATAAAACCAAGAAGATGGATCTTGA 
AATGTTCCTTGGAGTACAAGGCAAACGATGTGTGGATATAACAAGTAAGGCTCGGAAAAG 
AGGAGCTGAGAGATCTCCGTCAATGGAAAGAGAAATAGGGCTTTCACTTTCTCTAGAGAA 
AAAACAGAAACAAGAAGAGAGCAAAGAAGCTGTTCAGTCTCATCACCAAAGATACAATAG 
TAGCAGCTTAGATATGAATATGCCACGTATCATTTCATCTTCTCAAGGTAATAGAAAGGC 
CAGGGTGTCCGTGAGGGCGAGATGTGAGACCGCAACAATGAATGATGGATGCCAATGGAG
GAAGTACGGTCAGAAAACCGCGAAAGGGAATCCATGTCCTCGAGCTTATTACCGATGCAC 
CGTGGCTCCAGGATGTCCCGTTAGAAAACAGGTGCAAAGGTGTTTAGAAGACATGTCAAT 
ACTGATAACAACCTACGAAGGAACACATAACCATCCACTTCCGGTCGGAGCAACAGCCAT 
GGCTTCCACTGCCTCTACTTCTCCATTCTTGTTACTCGATTCCAGTGACAACCTCTCTCA 
TCCTTCCTATTACCAAACTCCTCAAGCCATAGACTCTTCTTTGATTACATACCCACAAAA 
TAGCAGCTACAACAATCGAACCATAAGAAGCTTGAACTTTGATGGTCCATCTAGAGGAGA 
TCACGTTTCATCTTCTCAAAACCGATTAAATTGGATGATGTAGAGTTTCCTATATCTCTA 
TGCTTGTTCTTTGGTCCCATTATTTGTCATTATGGATTCTTTGCCTTTCTTCTTGTTCTC 
GTTTCTAACATTTATGTTTCGTATA 
>G2137 Amino Acid Sequence (conserved domain in AA coordinates: 109-168)
MGIDLSLKLEAEEKKKEIEGSKHSRENKEDEEHDASGDEDEQMVKEDEDDSSSLGLRTRE

EENEREELLQLQIQMESVKEENTRLRKLVEQTLEDYRHLEMKFPVIDKTKKMDLEMFLGV

QGKRCVDITSKARKRGAERSPSMEREIGLSLSLEKKQKQEESKEAVQSHHQRYNSSSLDM

NMPRIISSSQGNRKARVSVRARCETATMNDGCQWRKYGQKTAKGNPCPRAYYRCTVAPGC

PVRKQVQRCLEDMSILITTYEGTHNHPLPVGATAMASTASTSPFLLLDSSDNLSHPSYYQ

TPQAIDSSLITYPQNSSYNNRTIRSLNFDGPSRGDHVSSSQNRLNWMM*

>G746 (1..1311)

ATGGGTGAGGAGTTAGCTGACACAATGAACCTGGATTTGAATCTTGGGCCTGGTCCTGAG 

TCTGATCTCCAACCTGCACCAAACGAGACTGTGAATTTGGCTGATTGGACTAATGACCCG 

CCTGAGAGATCTTCTGAAGCTGTGACAAGGATCAGGACTCGGCATAGGACACGGTTCAGA 

CAGCTTAATCTCCCGATCCCGGTTCTATCTGAAACCCATACCATGGCTATAGAGCTCAAC 

CAGTTGATGGGAAATTCTGTAAATAGAGCTGCTATGCAGACTGGTGAGGGTAGTGAAAGA 

GGCAATGAGGATTTGAAAATGTGTGAGAATGGCGATGGAGCCCTTGGGGACGGTGTATTG 

GATAAGAAAGCGGATGTCGAGAAAAGCAGTGGCAGCGACGGTAACTTTTTCGATTGTAAT 

ATATGTTTGGATTTGTCGAAGGAGCCGGTTCTCACCTGTTGTGGTCATCTTTACTGTTGG 

CCTTGTCTGTACCAATGGTTACAAATTTCGGATGGAAAGGAATGTCCTGTTTGTAAAGGA 

GAGGTGACCTCCAAAACCGTGACACCGATCTATGGACGTGGAAACCACAAGAGAGAAATT 

GAAGAGAGTTTAGATACTAAGGTCCCCATGAGACCACACGCGAGACGCATTGAGAGCTTG 

AGGAATACAATTCAAAGGTCGCCTTTTACAATACCAATGGAAGAAATGATTAGACGTATA 

CAGAATAGGTTTGACAGGGATTCAACCCCAGTCCCTGATTTTAGTAACCGAGAGGCATCA

GAAAGAGTCAACGATCGAGCCAATTCGATCCTTAACCGGTTGATGACATCTAGGGGAGTT 

AGATCAGAGCAGAACCAGGCTAGTGCTGCAGCAGCAGCCATTGTCGCAGCATCAGAGGAT 

ATTGATCTAAATCCAAACATTGCTCCTGATCTTGAAGGAGAAAGCAACACGAGATTCCAT 

CCTCTCTTGATCAGGAGACAGTTACAGTCGCACCGAGTTGCAAGGATCTCGACTTTCACT 

TCTGCGTTGAGTTCAGCTGAGAGGCTTGTGGATGCGTATTTTAGGACTCATCCGTTGGGG 

AGGAACCACCAAGAGCAAAACCATCATGCTCCTGTTGTGGTTGATGATAGAGACTCATTC 

TCAAGCATTGCAGCTGTTATAAACTCTGAGAGTCAAGTGGATACTGCAGTTGAGATCGAT 

TCTATGGCTCTTTCGACATCGTCCTCGAGGAGAAGGAATGAGAATGGTTCGAGGGTTTCT 

GATGTAGACAGTGCAGATTCTCGTCCGCCTAGGAGAAGGAGATTTACTTGA 

>G746 Amino Acid Sequence (domain in AA coordinates: 139-178) 

MGEELADTMNLDLNLGPGPESDLQPAPNETVNLADWTNDPPERSSEAVTRIRTRHRTRFR

QLNLPIPVLSETHTMAIELNQLMGNSVNRAAMQTGEGSERGNEDLKMCENGDGALGDGVL

DKKADVEKSSGSDGNFFDCNICLDLSKEPVLTCCGHLYCWPCLYQWLQISDGKECPVCKG

EVTSKTVTPIYGRGNHKREIEESLDTKVPMRPHARRIESLRNTIQRSPFTIPMEEMIRRI

QNRFDRDSTPVPDFSNREASERVNDRANSILNRLMTSRGVRSEQNQASAAAAAIVAASED

IDLNPNIAPDLEGESNTRFHPLLIRRQLQSHRVARISTFTSALSSAERLVDAYFRTHPLG

RNHQEQNHHAPVVVDDRDSFSSIAAVINSESQVDTAVEIDSMALSTSSSRRRNENGSRVS

DVDSADSRPPRRRRFT* 

>G2701 (46..837)

GTGTTTGTAGTTGAAACTTATTCTTCCCTTTTTTTGTTTTTAGGTATGGAGACTCTGCAT 

CCATTCTCTCACCTACCTATCTCTGACCACCGGTTCGTTGTTCAAGAGATGGTGAGCTTA 

CACAGCTCGAGTAGCGGTAGCTGGACTAAAGAAGAGAACAAGATGTTCGAACGAGCTCTT 

GCGATATACGCTGAAGACTCGCCTGATCGCTGGTTTAAAGTTGCTTCCATGATCCCTGGA 

AAGACTGTTTTTGATGTTATGAAGCAATATAGTAAGCTTGAAGAAGACGTTTTCGATATT 

GAAGCAGGACGTGTTCCCATTCCTGGTTATCCTGCAGCTTCTTCTCCCTTGGGGTTTGAC 

ACGGACATGTGTCGTAAACGGCCTAGTGGAGCTAGAGGATCTGATCAAGATCGAAAGAAA 

GGAGTCCCTTGGACAGAGGAAGAACACAGGAGATTCTTGTTAGGCCTTCTCAAGTACGGT 

AAAGGAGATTGGAGAAACATATCGAGAAACTTCGTGGTGTCAAAGACGCCAACGCAAGTG

GCGAGCCACGCCCAAAAGTATTACCAGAGACAGCTCTCCGGAGCCAAGGACAAACGCAGG 

CCAAGTATCCATGACATCACAACCGGCAATCTTCTCAATGCCAATCTCAACCGTTCCTTT 

TCCGATCATAGAGATATTCTCCCTGATTTAGGGTTTATCGATAAGGATGATACGGAGGAG 

GGAGTAATATTTATGGGTCAGAATCTCTCTTCAGAAAATCTGTTTTCTCCATCACCAACT 

TCATTCGAAGCTGCCATTAACTTCGCCGGAGAAAATGTCTTCAGTGCCGGAGCTTAAGGC 

AACATAGAATCCCCAAACTCAGCGGC 

>G2701 Amino Acid Sequence (domain in AA coordinates: 33-81, 129-183)
METLHPFSHLPISDHRFVVQEMVSLHSSSSGSWTKEENKMFERALAIYAEDSPDRWFKVA
SMIPGKTVFDVMKQYSKLEEDVFDIEAGRVPIPGYPAASSPLGFDTDMCRKRPSGARGSD
QDRKKGVPWTEEEHRRFLLGLLKYGKGDWRNISRNFVVSKTPTQVASHAQKYYQRQLSGA
KDKRRPSIHDITTGNLLNANLNRSFSDHRDILPDLGFIDKDDTEEGVIFMGQNLSSENLF
SPSPTSFEAAINFAGENVFSAGA*
>G1819 (1..639) 

ATGGAAGAGAACAACGGCAACAACAACCACTACCTGCCGCAACCATCGTCTTCCCAACTG 
CCGCCGCCACCATTGTATTATCAATCAATGCCGTTGCCGTCATATTCACTGCCGCTGCCG 
TACTCACCGCAGATGCGGAATTATTGGATTGCGCAGATGGGAAACGCAACTGATGTTAAG 
CATCATGCGTTTCCACTAACCAGGATAAAGAAAATCATGAAGTCCAACCCGGAAGTGAAC
ATGGTCACTGCAGAGGCTCCGGTCCTTATATCGAAGGCCTGTGAGATGCTCATTCTTGAT 
CTCACAATGCGATCGTGGCTTCATACCGTGGAGGGCGGTCGCCAAACTCTCAAGAGATCC 
GATACGCTCACGAGATCCGATATCTCCGCCGCAACGACTCGTAGTTTCAAATTTACCTTC 
CTTGGCGACGTTGTCCCAAGAGACCCTTCCGTCGTTACCGATGATCCCGTGCTACATCCG 
GACGGTGAAGTACTTCCTCCGGGAACGGTGATAGGATATCCGGTGTTTGATTGTAATGGT 
GTGTACGCGTCACCGCCACAGATGCAGGAGTGGCCGGCGGTGCCTGGTGACGGAGAGGAG 
GCAGCTGGGGAAATTGGAGGAAGCAGCGGCGGTAATTGA 

>G1819 Amino Acid Sequence (domain in AA coordinates: 46-188) 

MEENNGNNNHYLPQPSSSQLPPPPLYYQSMPLPSYSLPLPYSPQMRNYWIAQMGNATDVK

HHAFPLTRIKKIMKSNPEVNMVTAEAPVLISKACEMLILDLTMRSWLHTVEGGRQTLKRS

DTLTRSDISAATTRSFKFTFLGDVVPRDPSVVTDDPVLHPDGEVLPPGTVIGYPVFDCNG

VYASPPQMQEWPAVPGDGEEAAGEIGGSSGGN* 

>G1227 (372..1451)

TCTTCCGTGTGTTAACAGAAGTCCCCACAATTGTCTGTCTTCGCTGCGAGACAAAACTGC 
CACAGCCAATAATGTTTCTCTGAGGGACCTTGCTTCTGTCAGAGACTCGCTCTCTCTCTC 
CTCTTCTTGCTCTGCTCAGCTCTCTCACCAACTCATCTTCAGTCCTCAAACAAACATCTG 
TTCTCATCTTTGTTTTCTTTCCTTTCTTTCTCATATCTCATTTTCAATTTTCCCAATTTC 
TCTTCAACATCTT(^TAGCAATtTAAGACCACTATTCCATTATAAAGCTAACTGCTTTAG 
AAACTCCTCACATTTATTTCTTCCCCATCATTGTTTTAGAGAGGGAGAAAGAAAAAGAGC 
TCAGCTTTCTGATGGAGAGGAGTATTCAAGGACAAAACAAGCTCTGTTGTTTGGACCAAA 
AAGTGAATGTGAGAAGAAGCCTACAAGTTCAAGAAACTGTAGAGGATCATCAAAGCTTTG 
CCCTTGAAGAGGAAGAACAACAACTCTCAACTCCGAGCTTGCTGCAAGACACAACAATAC 
CATTTCTACAAATGCTGCAACAAAGTGAAGACCCTTCACCGTTTTTGTCATTCAAAGACC 
CAAGCTTTCTAGCACTACTATCTCTCCAGACACTTGAAAAGCCTTGGGAACTCGAAAACT 
ACCTCCCACATGAAGTTCCAGAGTTTCATTCACCGATCCATTCTGAAACCAACCACTACT 
ATCATAATCCATCTTTGGAAGGAGTCAATGAAGCCATCTCAAACCAAGAACTTCCATTCA
ACCCACTAGAGAATGCGCGTTCAAGACGCAAGCGGAAAAACAACAACTTGGCATCATTGA 
TGACAAGAGAAAAGCGAAAGAGAAGAAGAACTAAACCAACAAAGAACATAGAAGAGATAG 
AGAGTCAAAGAATGACACACATTGCGGTTGAACGAAACCGCAGACGCCAAATGAACGTTC 
ATCTGAACTCACTCCGCTCCATCATTCCATCTTCATACATCCAGAGGGGAGACCAAGCGT 
CAATAGTAGGAGGAGCAATAGACTTCGTAAAGATCCTAGAGCAACAGTTGCAATCCCTTG 
AAGCACAAAAGAGAAGTCAACAGAGTGATGATAACAAAGAGCAAATTCCAGAAGATAACA 
GTCTCAGGAACATTTCGTCGAACAAGTTGCGTGCGAGTAATAAAGAAGAACAAAGTAGCA 
AACTCAAAATCGAAGCCACAGTGATAGAGAGTCACGTCAACCTAAAAATTCAATGTACGA 
GGAAACAAGGACAACTTCTCAGATCAATCATATTGCTGGAGAAACTTCGATTCACTGTTC 
TTCATCTCAACATCACATCTCCGACCAATACATCTGTCTCTTATTCCTTCAACCTCAAGA 
TGGAAGATGAATGTAATTTGGGATCAGCGGATGAGATAACGGCGGCGATTCGTCAGATTT 
TCGACAGCTGATTGACTAATCCAAGTAAAAAGTAAAATAAAAAAAGAAACGTTTACTTTG 
GTAACTTCGTTTTGATGATTAAATTCTTTATTTGGTCGTATGTGATTGGAGTCTTCTCGG 
CATGGAACTTGACTTTGGTTTTAGGGTACTAGTCTCTACAGAAGCTGTGGTCCTTCTTTG 
GATGC 

>G1227 Amino Acid Sequence (domain in AA coordinates: 183-244) 
MERSIQGQNKLCCLDQKVNVRRSLQVQETVEDHQSFALEEEEQQLSTPSLLQDTTIPFLQ
MLQQSEDPSPFLSFKDPSFLALLSLQTLEKPWELENYLPHEVPEFHSPIHSETNHYYHNP
SLEGVNEAISNQELPFNPLENARSRRKRKNNNLASLMTREKRKRRRTKPTKNIEEIESQR

MTHIAVERNRRRQMNVHLNSLRSIIPSSYIQRGDQASIVGGAIDFVKILEQQLQSLEAQK
RSQQSDDNKEQIPEDNSLRNISSNKLRASNKEEQSSKLKIEATVIESHVNLKIQCTRKQG
QLLRSIILLEKLRFTVLHLNITSPTNTSVSYSFNLKMEDECNLGSADEITAAIRQIFDS*
>G2417 (118..1311)

CATACCGGTGGAAGATTCTGCTTTACTACGCTCTCCGCTTCTTCTTCTCCTCGATTCGAT 
TCTCCTCATGGGTTTATCATGAATTTTTAGGTTTTGAGTAATTCAGAAACTCGAGTGATG 
ATCCCGAATGATGATGATGATGCAAATTCTATGAAGAATTATCCGTTAAATGATGATGAT 
GCAAATTCTATGAAGAATTATCCGTTAAATGATGATGATGCAAATTCTATGGAGAATTAT 
CCGTTAAGGTCAATTCCGACGGAGCTTTCACACACTTGTTCATTGATACCACCTTCTTTA 
CCAAACCCTTCAGAAGCAGCAGCAGACATGTCCTTCAATTCAGAACTCAATCAAATCATG 
GCAAGGCCTTGTGATATGCTCCCTGCCAATGGTGGAGCTGTTGGTCATAACCCTTTTTTG 
GAACCAGGATTCAACTGCCCCGAGACAACAGATTGGATTCCCTCTCCACTCCCCCATATT 
TATTTTCCTTCGGGTTCTCCCAATCTAATAATGGAGGATGGTGTCATTGATGAGATTCAC 
AAACAAAGTGACTTGCCACTTTGGTATGACGACTTGATTACCACTGATGAAGATCCACTC 
ATGTCTAGTATCTTGGGCGATCTTCTCCTTGACACTAATTTCAACTCAGCTTCAAAGGTG 
CAGCAACCAAGTATGCAATCGCAGATTCAACAACCCCAAGCTGTTCTGCAGCAGCCTTCT 
TCTTGTGTGGAATTGCGCCCACTTGATAGGACAGTATCCTCAAACAGCAACAACAATAGC 
AACAGTAATAATGCAGCAGCAGCAGCTAAGGGACGTATGCGTTGGACGCCTGAACTTCAT 
GAGGTTTTTGTTGACGCTGTTAACCAGCTCGGTGGCAGTAATGAAGCAACTCCTAAAGGT 
GTCCTGAAGCATATGAAAGTCGAAGGTTTGACTATTTTTCATGTCAAAAGTCATTTGCAG 
AAATATAGAACAGCTAAATATATACCAGTACCATCAGAAGGTTCGCCGGAGGCAAGGTTG 
ACACCGCTTGAGCAAATTACATCTGATGATACGAAACGTGGGATAGATATCACTGAGACT 
CTGCGAATTCAGATGGAACATCAGAAGAAACTGCATGAGCAGCTTGAGAGTCTAAGAACA 
ATGCAACTTCGGATAGAAGAGCAAGGAAAGGCGCTGTTGATGATGATTGAGAAGCAAAAT 
ATGGGTTTCGGCGGACCAGAACAAGGAGAGAAAACAAGTGCGAAAACGCCTGAAAATGGT 
TCAGAGGAGTCGGAATCCCCGCGGCCAAAGCGTCCGAGAAATGAAGAATGAAGGAAACCT 
TTCTTCGGATGGTAGATCATAAAACTGTGGTTTTGGTGGAGTTGTAGAGTATGACTTATT 
AGGAGTAGAGCTTTCAGTCTTCTTCAGGC 

>G2417 Amino Acid Sequence (domain in AA coordinates: 235-285) 

MIPNDDDDANSMKNYPLNDDDANSMKNYPLNDDDANSMENYPLRSIPTELSHTCSLIPPS

LPNPSEAAADMSFNSELNQIMARPCDMLPANGGAVGHNPFLEPGFNCPETTDWIPSPLPH

IYFPSGSPNLIMEDGVIDEIHKQSDLPLWYDDLITTDEDPLMSSILGDLLLDTNFNSASK

VQQPSMQSQIQQPQAVLQQPSSCVELRPLDRTVSSNSNNNSNSNNAAAAAKGRMRWTPEL

HEVFVDAVNQLGGSNEATPKGVLKHMKVEGLTIFHVKSHLQKYRTAKYIPVPSEGSPEAR

LTPLEQITSDDTKRGIDITETLRIQMEHQKKLHEQLESLRTMQLRIEEQGKALLMMIEKQ 

NMGFGGPEQGEKTSAKTPENGSEESESPRPKRPRNEE* 

>G2116 (104..1117)

TTCATCTCCATCATTATCTCCATTGACATTGTTCTCAATTGCGAATAATAATCATAATTA 
TTCACACAACCAAAGCATTCATCTCTCAGATTCTCTTAAAAAAATGGAGAAATCAGATCC 
TCCACCAGTCCCAAAGCCCGGCGCCACTATTATCCCCTCCTCCGATCCAATTCCTAATGC 
CGATCCGATTCCATCTTCTTCCTTCCACCGCCGATCTCGCTCCGACGATATGTCCATGTT 
CATGTTCATGGATCCCCTCTCCTCCGCCGCACCACCTTCCTCCGACGACCTTCCCTCCGA 
CGACGATCTCTTCTCTTCTTTCATCGATGTCGATAGCCTCACCTCTAATCCCAATCCCTT 
TCAAAATCGTTCCCTCTCCTCCAACTCCGTTTCCGGCGCTGCTAATCCTCCTCCTCCTCC 
TTCCTCTCGTCCTCGCCACCGTCACAGCAATTCCGTTGACGCTGGATGCGCCATGTATGC 
CGGTGATATCATGGACGCTAAGAAAGCTATGCCTCCTGAAAAACTCTCTGAGCTTTGGAA 
CATCGATCCCAAACGCGCCAAAAGGATTCTAGCGAATCGACAATCTGCAGCTCGATCCAA 
AGAGAGAAAAGCTCGATACATTCAAGAACTTGAGCGCAAAGTTCAATCTCTTCAAACCGA 
AGCTACCACTCTCTCTGCTCAGCTTACTCTCTACCAGAGAGACACAAATGGACTAGCAAA 
CGAAAACACAGAGCTGAAACTTAGGTTGCAAGCAATGGAACAACAAGCTCAGCTTCGTAA 
TGCTTTAAACGAAG€GTTGAGGAAAGAAGTTGAAAGGATGAAGATGGAGACAGGAGAAAT 
CTCTGGTAATTCAGATTCGTTTGATATGGGAATGCAGCAGATTCAGTATTCTTCCTCAAC 
TTTCATGGCTATTCCACCATATCATGGCTCAATGAACCTCCATGATATGCAGATGCATTC 
TAGTTTCAATCCTATGGAGATGTCCAATTCTCAAAGCGTGTCGGACTTTCTACAGAACGG 
CCGAATGCAAGGGCTGGAGATTAGTAGCAATAGCTCAAGCTTAGTCAAATCTGAAGGACC 
TTCTCTCTCTGCTAGTGAGAGTAGCTCTGCCTATTGACGACAAGATTATGATGAGGCTCA 
TTTTTCTG 

>G2116 Amino Acid Sequence (conserved domain in AA coordinates: 150-210)

MEKSDPPPVPKPGATIIPSSDPIPNADPIPSSSFHRRSRSDDMSMFMFMDPLSSAAPPSS 

DDLPSDDDLFSSFIDVDSLTSNPNPFQNPSLSSNSVSGAANPPPPPSSRPRHRHSNSVDA
GCAMYAGDIMDAKKAMPPEKLSELWNIDPKRAKRILANRQSAARSKERKARYIQELERKV
QSLQTEATTLSAQLTLYQRDTNGLA^^

METGEISGNSDSFDMGMQQIQYSSSTFMAIPPYHGSMNLHDMQMHSSFNPMEMSNSQSVS

DFLQNGRMQGLEISSNSSSLVKSEGPSLSASESSSAY*

>G647 (1..948)

ATGATGATCGGCGAAAATAAAAACCGGCCACATCCAACGATCCATATCCCTCAATGGGAT 

CAAATCAACGATCCAACGGCCACAATCTCTTCACCATTCTCTTCCGTCAACCTTAACAGC 

GTTAACGACTACCCACACTCTCCGTCACCGTATCTCGACTCCTTCGCTTCTCTCTTCCGT 

TACCTCCCGTCAAACGAGTTAACAAACGATTCAGACTCATCAAGTGGCGACGAGTCATCA 

CCACTCACCGACTCATTCTCCTCCGACGAGTTTCGCATCTACGAGTTCAAAATCCGGCGA 

TGCGCTCGAGGTCGATCTCATGATTGGACGGAGTGTCCGTTCGCACATCCCGGAGAAAAA 

GCTCGACGACGTGATCCGAGAAAGTTTCATTACTCCGGCACCGCTTGTCCTGAGTTTCGT 

AAAGGAAGTTGTAGAAGAGGTGATTCGTGTGAGTTCTCTCATGGAGTTTTCGAGTGTTGG 

CTCCATCCTTCTCGTTACCGTACTCAGCCGTGTAAAGACGGAACTAGCTGCCGGAGAAGA 

ATCTGTTTCTTCGCTCATACGACGGAGCAGTTACGTGTATTACCTTGTTCGTTAGATCCA 

GATCTTGGATTCTTCTCAGGATTAGCTACTTCTCCGACTTCGATTCTTGTTTCTCCTTCG 

TTTTCACCACCGTCGGAATCTCCGCCGCTTTCTCCGAGTACCGGTGAACTTATTGCGTCG 

ATGAGGAAAATGCAATTGAACGGAGGTGGTTGTTCGTGGAGTTCTCCGATGAGATCTGCA 

GTTAGGTTACCTTTTTCGTCGTCTCTGCGTCCGATTCAGGCGGCAACGTGGCCGAGGATA 

AGAGAGTTTGAGATCGAAGAAGCTCCGGCGATGGAATTTGTGGAATCTGGGAAAGAGCTG 

AGAGCGGAGATGTATGCAAGACTCAGTAGAGAGAACTCACTCGGTTGA 

>G647 Amino Acid Sequence (domain in aa coordinates: 77-192)

MMIGENKNRPHPTIHIPQWDQINDPTATISSPFSSVNLNSVNDYPHSPSPYLDSFASLFR

YLPSNELTNDSDSSSGDESSPLTDSFSSDEFRIYEFKIRRCARGRSHDWTECPFAHPGEK

ARRRDPRKFHYSGTACPEFRKGSCRRGDSCEFSHGVFECWLHPSRYRTQPCKDGTSCRRR 

ICFFAHTTEQLRVLPCSLDPDLGFFSGLATSPTSILVSPSFSPPSESPPLSPSTGELIAS 

MRKMQLNGGGCSWSSPMRSAVRLPFSSSLRPIQAATWPRIREFEIEEAPAMEFVESGKEL 

RAEMYARLSRENSLG* 

>G974 (377..1162)

AAAAAAAAAGTTGATATACTTTCTGGTTTTCTCCTTAACTTTTATTCTTTACAAATCCAT 

CCCCCTTAGATCTGTTTATTTCCCGCTACTTTGATTCATTTCTGTTAGTAATCTGTCTTT 

CGTATAGAAGAAAACTGATTTCTTGGTTTGTATTTTCTTAAAGAGATCAATCTTTTTTTA 

TTTTTGATCTTCTTGTGTTTTTTTTTCTTTGTAGAATTAATCGTTTGTGAGGGTATTTTT 

TTAATTCCCTCCTCTGAGAAATCTACACAGAGGTTTTTTATTTTATAAACCTCTTTTTCG 

ATTTTCTTGAAAACAAAAAATCCTGTTCTTTACTTTTTTTACAAGAACAAGGGAAAAAAA 

TTTCTTTTTATTAGAAATGACAACTTCTATGGATTTTTACAGTAACAAAACGTTTCAACA 

ATCTGATCCATTCGGTGGTGAATTAATGGAAGCGCTTTTACCTTTTATCAAAAGCCCTTC 

CAACGATTCATCCGCGTTTGCGTTCTCTCTACCCGCTCCAATTTCATACGGGTCGGATCT 

CCACTCATTTTCTCACCATCTTAGTCCTAAACCGGTCTCAATGAAACAAACCGGTACTTC 

CGCGGCTAAACCGACGAAGCTATACAGAGGAGTGAGACAACGTCACTGGGGAAAATGGGT 

GGCTGAGATTCGTTTACCGAGGAATCGAACTCGACTTTGGCTCGGAACATTCGACACGGC 

GGAGGAAGCTGCTTTAGCTTATGACAAGGCGGCGTATAAGCTCCGAGGAGATTTTGCGCG 

GCTTAATTTCCCTGATCTCCGTCATAACGACGAGTATCAACCTCTTCAATCATCAGTCGA 

CGCTAAGCTTGAAGCTATTTGTCAAAACTTAGCTGAGACGACGCAGAAACAGGTGAGATC 

AACGAAGAAGTCTTCTTCTCGGAAACGTTCATCAACCGTCGCAGTGAAACTACCGGAGGA 

GGACTACTCTAGCGCCGGATCTTCGCCGCTGTTAACGGAGAGTTATGGATCTGGTGGATC 

TTCTTCGCCGTTGTCGGAGCTGACGTTTGGTGATACGGAGGAGGAGATTCAGCCGCCGTG 

GAACGAGAACGCGTTGGAGAAGTATCCGTCGTACGAGATCGATTGGGATTCGATTCTTCA 

GTGTTCGAGTCTTGTAAATTAGATGTTGCCATAGGGGTATTTTAGGGACTTTAGAGCTCT 

CTGCGATGGAGTTTTTGGTCATTGCAGAGATTTTATTATTATTAAGGGGGTTTGTTATGT 

TAATATCAAATAAGTTTATCTACTTTGATGTTAATTAGTGTTAATCTCTGCGTCGGTCCA 

AGCTGTTTTTTTTTGGCATGCTTCGACCGTGTGAGATTTCTTATGTAATTTTTGTAGTTC 

CTTGATTTTCTTAGTTCAAGTTAAATTGGCACAAAAAAAAAAAAAAAAAA 

>G974 Amino Acid Sequence (domain in AA coordinates: 81-140) 

MTTSMDFYSNKTFQQSDPFGGELMEALLPFIKSPSNDSSAFAFSLPAPISYGSDLHSFSH

HLSPKPVSMKQTGTSAAKPTKLYRGVRQRHWGKWVAEIRLPRNRTRLWLGTFDTAEEAAL

AYDKAAYKLRGDFARLNFPDLRHNDEYQPLQSSVDAKLEAICQNLAETTQKQVRSTKKSS
SRKRSSTVAVKLPEEDYSSAGSSPLLTESYGSGGSSSPLSELTFGDTEEEIQPPWNENAL

EKYPSYEIDWDSILQCSSLVN* 

>G1419 (27..692)

GAAGACTCCAACATAATTCATCATCTATGGCTTCTTCACATCAACAACAGCAAGAACAAG 
ACCAGTCAGCTTTAGATCTCATAACCCAACACCTTCTTACTGATTTCCCTTCCTTAGACA 
CCTTTGCCTCCACCATCCACCACTGCACCACCTCAACTCTAAGCCAACGCAAACCACCTC 
TTGCCACTATAGCAGTTCCTACTACTGCACCGGTGGTTCAAGAGAATGATCAAAGGCATT 
ACAGAGGCGTCAGGAGAAGACCATGGGGTAAGTATGCGGCTGAGATCAGAGACCCAAACA 
AGAAAGGTGTTCGTGTCTGGTTAGGCACTTTTGACACAGCCATGGAAGCTGCAAGAGGTT 
ATGACAAGGCAGCTTTTAAACTACGAGGAAGCAAAGCTATTCTTAACTTCCCACTTGAAG 
CAGGAAAGCATGAGGACTTGGGAGACAACAAGAAGACTATTTCTTTAAAAGCAAAGAGGA 
AGAGACAGGTGACGGAGGATGAAAGCCAGCTGATCAGCCGTAAAGCTGTTAAGAGGGAAG 
AAGCTCAGGTTCAGGGTGATGCTTGTCCATTAACGCCATCAAGTTGGAAGGGGTTTTGGG 
ACGGAGCAGACAGTAAAGACATGGGAATATTTTCCGTGCCTCTGTTATCTCCTTGTCCAT 
CTCTTGGACACTCTCAACTCGTAGTTACTTAAGCTTCAGAGGGTCAAACTGGAAAAAATC 
AACATTGGATTGTTTTCAAAGCTTCTAGATTAGCTGATTGTAAAAAAATGTTTTACTATA 
TTCATTCATTCTTCTTAAATGCAATTCTTTCTACCCTTCC 

>G1419 Amino Acid Sequence (domain in AA coordinates: 69-137) 

MASSHQQQQEQDQSALDLITQHLLTDFPSLDTFASTIHHCTTSTLSQRKPPLATIAVPTT 

APVVQENDQRHYRGVRRRPWGKYAAEIRDPNKKGVRVWLGTFDTAMEAARGYDKAAFKLR

GSKAILNFPLEAGKHEDLGDNKKTISLKAKRKRQVTEDESQLISRKAVKREEAQVQADAC

PLTPSSWKGFWDGADSKDMGIFSVPLLSPCPSLGHSQLVVT*

>G1634 (22..855)

TTATCTCGTAGCCTTTAAACGATGGAGACTCTGCATCCACTACTCTCGCACGTGCCAACT 

TCTGACCACCGGTTTGTAGTTCAAGAGATGATGTGCTTGCAAAGCTCGAGCTGGACTAAA 

GAAGAGAACAAGAAGTTTGAGCGAGCTCTTGCTGTCTACGCTGATGACACGCCTGATCGC 

TGGTTCAAAGTTGCTGCTATGATCCCTGGAAAGACCATATCAGATGTCATGAGGCAATAC 

TCTAAGCTTGAAGAAGACCTCTTCGATATCGAAGCAGGAGTTGTCCCGATCCCGGGTTAC 

CGTTCAGTTACTCCTTGTGGATTTGATCAGGTTGTGAGTCCACGTGACTTTGATGCGTAT 

CGTAAACTTCCTAATGGAGCCAGAGGATTTGATCAAGACCGTAGGAAAGGAGTTCCATGG 

ACGGAGGAAGAACACAGGAGATTCTTGTTAGGGCTTCTCAAGTATGGGAAAGGAGATTGG 

AGAAACATATCGAGGAACTTTGTGGGATCAAAAACACCAACTCAGGTTGCAAGTCATGCC 

CAAAAGTACTACCAAAGACAGCTTTCCGGTGCGAAAGACAAACGACGGCCTAGCATTCAC 

GACATCACCACCGTCAATCTTCTCAATGCCAATCTTAGCCGTCCATCGTCTGATCACGGT 

TGCTTAGTCTCAAAACAGGCCGAGCCGAAACTAGGGTTCACCGACAGGGATAATGCAGAG 

GAGGGAGTTATGTTTCTTGGTCAGAATCTATCCTCGGTCTTCTCTTCCTACGATCCTGCC 

ATTAAGTTTTCCGGAGCAAATGTTTACGGTGAAGGAGGTTACTGTATCTCACAAGATCTT 

GAAACGAGAAAATGAGAATTTTGAAATTTTAACTATTGCAACGAAACCATAATTGC 

>G1634 Amino Acid Sequence (domain in AA coordinates: 129-180)

METLHPLLSHVPTSDHRFVVQEMMCLQSSSWTKEENKKFERALAVYADDTPDRWFKVAAM

IPGKTISDVMRQYSKLEEDLFDIEAGLVPIPGYRSVTPCGFDQVVSPRDFDAYRKLPNGA 

RGFDQDRRKGVPWTEEEHRRFLLGLLKYGKGDWRNISRNFVGSKTPTQVASHAQKYYQRQ 

LSGAKDKRRPSIHDITTVNLLNANLSRPSSDHGCLVSKQAEPKLGFTDRDNAEEGVMFLG

QNLSSVFSSYDPAIKFSGANVYGEGGYCISQDLETRK* 

>G1637 (1..954) 

ATGGTGAAGGAGACGGTGACGGTGGCGAAAACGTGCTCACACTGTGGCCATAATGGCCAT 
AACGCACGGACTTGTCTCAACGGCGTTAATAAGGCAAGTGTTAAACTGTTCGGCGTTAAT 
ATATCGTCTGATCGGATTAGGCCGCCTGAGGTAACGGCGTTAAGGAAGAGTCTTAGTTTG 
GGAAACCTTGATGCTCTTCTCGCTAACGATGAAAGTAACGGTAGCGGTGATCCTATCGCC 
GCCGTTGATGATACCGGTTATCATTCCGATGGTCAGATTCATTCCAAGAAGGGTAAAACT 
GCTCATGAGAAGAAAAAGGGGAAGCCATGGACGGAAGAAGAACATCGTAATTTCTTAATC 
GGTTTAAACAAACTCGGAAAAGGAGATTGGAGAGGCATTGCAAAGAGTTTCGTGTCGACA 
AGAACACCAACACAAGTCGCAAGTCATGCTCAGAAATATTTTATTAGGTTAAACGTTAAC 
GACAAGAGAAAAAGACGTGCTAGTCTCTTTGACATCTCTCTCGAAGATCAGAAGGAGAAA 
GAGAGGAACTCTCAAGATGCTTCAACAAAGACTCCACCTAAACAACCAATAACCGGAATT 
CAACAACCGGTAGTACAAGGTCATACTCAAACCGAGATTTCGAACAGGTTTCAGAATTTA 
TCAATGGAGTATATGCCAATCTACCAACCCATACCACCTTACTACAACTTTCCACCTATT 
ATGTACCATCCAAATTATCCAATGTACTATGCCAACCCTCAAGTACCGGTTAGGTTTGTT

CATCCTTCTGGTATACCTGTTCCAAGACATATACCGATTGGTTTGCCTCTGTCTCAACCG 

AGTGAAGCTTCTAATATGACAAATAAAGACGGTTTGGATCTTCATATCGGTTTGCCTCCA 

CAAGCTACTGGAGCTTCTGACTTGACTGGTCATGGCGTTATTCATGTGAAATGA 

>G1637 Amino Acid Sequence (domain in AA coordinates: 109-173)

MVKETVTVAKTCSHCGHNGHNARTCLNGVNKASVKLFGVNISSDRIRPPEVTALRKSLSL

GNLDALLANDESNGSGDPIAAVDDTGYHSDGQIHSKKGKTAHEKKKGKPWTEEEHRNFLI
GLNKLGKGDWRGIAKSFVSTRTPTQVASHAQKYFIRLNVNDKRKRRASLFDISLEDQKEK

ERNSQDASTKTPPKQPITGIQQPVVQGHTQTEISNRFQNLSMEYMPIYQPIPPYYNFPPI
MYHPNYPMYYANPQVPVRFVHPSGIPVPRHIPIGLPLSQPSEASNMTNKDGLDLHIGLPP
QATGASDLTGHGVIHVK* 
>G1818 (601..1161)

TAACAAATCAAATAATTAGAGAAATAACCAAAATTTAACTTTTAGAGGGACTACAGGATT 
TGTACTTTGTACATTCATATATTATTGTTATATATCGTTTCATACATTAATTTGAACCAA 
TGTAAATTAAGTAAAATTCAATTTAACATCATGAGCAAATTCTTATTAAAATTCTCTTAA 
AATTTTGAGCAAATTATGCTTTCACATTTAACATTTC 

TTCAAAACTAAGTTTTGTACAGCAAAATTTTAACTTTCAATTTTATAGAGAAAAAGGTAT 

TTTTTTTTTTGTTTCATTTTTATAAGACTATTATTTGGTATATAATATACACTTTAAGTA 

AAAACAAATCTCTTTCTTTTTTCTTCTTATAATACCAACCACAAGTCTGTCAGTCACACA 

CATACAGTTAATAACATTAAATATTCTTAACAAACTACTAAATAGGTTGAGATTCATATA 

TGTAAAGAGATGACTTCTTAATCTTATCCTACCATATCTTATATACGCTTAATTTTCCTT 

TATATATGCAAACCTCCACATAAAAATATCTCAAACCCAAACACTTCAAACAAAAAAAAA 

ATGGAGAACAACAACAACAACCACCAACAGCCACCGAAAGATAACGAGCAACTAAAGAGT 

TTCTGGTCAAAGGGGATGGAAGGTGACTTGAATGTCAAGAATCACGAGTTCCCCATCTCT 

CGTATCAAGAGGATAATGAAGTTTGATCCGGATGTGAGTATGATCGCTGCTGAGGCTCCA 

AATCTCTTATCTAAGGCTTGTGAAATGTTTGTCATGGACCTCACGATGCGTTCATGGCTC 

CATGGTCAAGAGAGCAACCGACTCACGATACGGAAATCTGATGTTGATGCCGTAGTGTCT 

CAAACCGTCATCTTTGATTTCTTGCGTGATGATGTCCCTAAGGACGAGGGAGAGCCCGTT 

GTCGCCGCTGCTGATCCTGTGGACGATGTTGCTGATCATGTGGCTGTGCCAGATCTTAAC

AATGAAGAACTGCCGCCGGGAACGGTGATAGGAACTCCGGTTTGTTACGGTTTAGGAATA 

CACGCGCCACACCCGCAGATGCCTGGAGCTTGGACCGAGGAGGATGCGACTGGGGCAAAT 

GGAGGAAACGGTGGGAATTAATATTTGGATTGGGTTTTGTAACCGCTGTTGTGAGAACTT 

TTTCTGTTGTTTTGTCCAAAAAAAAAAAAGAATGTATTTCTGTTGTTGTCTTTCAAATGA 

ATCTAATGGTTTATGAATATTGGCTTTAGATTAATTTATGCATACAAAAACACAAGGATT 

ACGGATAAAAAAGTCCTCAGTTTACCCATGGAAACATAATCTTCTAGTGATTCCTTATGA 

GAGTAGAAAAGAATCATATATTATAATCTATTTCATAAGAGATAGGGTACTGTAAACAAG 

GATGTTTATTCGGCTATTTCTTTTTTTTTTAATCACTTTTACTTGTCAAGACTCTTTTGT 

GTTTGCAGCTTTTTGTTAGATTACATTCTAGAGGCAACAAGATCCAGAGATCTAGCAAAA 

AAAACTTATTTTGAAACCTGAATCTATTTTAAAAATTTTCCAACTCATTTTTCGTTCTTA 

TTCTTTGTTTTCCAACGGAATTTGGCGCACAAACGATTTATTTGAATTTTGTCTTTCAAG 

>G1818 Amino Acid Sequence (domain in AA coordinates: 36-113) 

MENNNNNHQQPPKDNEQLKSFWSKGMEGDLNVKNHEFPISRIKRIMKFDPDVSMIAAEAP

NLLSKACEMFVMDLTMRSWLHAQESNRLTIRKSDVDAVVSQTVIFDFLRDDVPKDEGEPV

VAAADPVDDVADHVAVPDLNNEELPPGTVIGTPVCYGLGIHAPHPQMPGAWTEEDATGAN
GGNGGN* 

>G1820 (1..609) 

ATGGCTGAGAACAAGAACAACAACGGCGACAACATGAACAACGACAACCACCAGC7UVCCA 
CCGTCGTACTCGCAGCTGCCGCCGATGGCATCATCCAACCCTCAGTTACGTAATTACTGG 
ATTGAGCAGATGGAAACCGTCTCGGATTTCAAAAACCGTCAGCTTCCATTGGCTCGAATT 
AAGAAGATCATGAAGGCTGATCCAGATGTGCACATGGTCTCCGCAGAGGCTCCGATCATC 
TTCGCAAAGGCTTGCGAAATGTTCATCGTTGATCTCACGATGCGGTCGTGGCTCAAAGCC 
GAGGAGAACAAACGCCACACGCTTCAGAAATCGGATATCTCCAACGCAGTGGCTAGCTCT 
TTCACCTACGATTTCCTTCTTGATGTTGTCCCTAAGGACGAGTCTATCGCCACCGCTGAT 
CCTGGCTTTGTGGCTATGCCACATCCTGACGGTGGAGGAGTACCGCAATATTATTATCCA 
CCGGGAGTGGTGATGGGAACTCCTATGGTTGGTAGTGGAATGTACGCGCCATCGCAGGCG 
TGGCCAGCAGCGGCTGGTGACGGGGAGGATGATGCTGAGGATAATGGAGGAAACGGCGGC 
GGAAATTGA

>G1820 Amino Acid Sequence (domain in AA coordinates: 70-133) 
MAENKNNNGDNMNNDNHQQPPSYSQLPPMASSNPQLRNYWIEQMETVSDFKNRQLPLARI

KKIMKADPDVHMVSAEAPIIFAKACEMFIVDLTMRSWLKAEENKRHTLQKSDISNAVASS
FTYDFLLDVVPKDESIATADPGPVAMPHPDGGGVPQYYYPPGVVMGTPMVGSGMYAPSQA
WPAAAGDGEDDAEDNGGNGGGN* 
>G1903 (1..1200)

ATGTCTAAATCTAGAGATACGGAGATAAAGTTGTTTGGGAGGACAATCACATCTCTTTTA 
GATGTGAATTGTTATGATCCGTCGTCGTTGTCCCCTGTTCACGATGTTTCTTCTGATCCA 
AGCAAGGAGGATTCGTCTTCTTCTTCATCTTCTTGTTCTCGAACTATTGGACCAATCAGG 
GTTCCGGTTAAAAAAAGTGAGCAAGAGAGTAACAAATTCAAAGATCCATATATATTATCC 
GATCTAAACGAACCACCAAAAGC^GTATCTGAGATTTCATCACCAAGAAGTTCCAAGAAC 
AACTGTGATCAACAGAGCGAGATCACAACAACAACTACCACAAGTACTACATCAGGAGAG 
AAATCAACGGCTCTCAAGAAACCGGACAAGCTTATTCCATGTCCTAGATGTGAAAGCGCA 
AACACCAAATTCTGTTATTACAACAACTACAACGTGAACCAGCCACGTTACTTCTGCAGG 
AACTGTCAGAGGTATTGGACAGCTGGTGGATCTATGAGGAACGTTCCTGTTGGCTCAGGT 
CGTCGCAAGAACAAAGGATGGCCTTCTTCAAACCATTACTTGCAAGTCACTTCTGAGGAT 
TGTGATAATAATAACTCGGGGACGATCCTTAGTTTCGGTTCTTCGGAGTCTTCGGTTACA
GAGACTGGTAAGCATCAGTCAGGTGATACAGCAAAGATAAGTGCTGATTCAGTTTCTCAA 
GAAAATAAAAGCTACCAAGGGTTTCTTCGTCCGCAAGTAATGTTACCTAATAATTCTTCT 
CCTTGGCCTTACCAATGGAGTCCAACGGGTCCTAACGCTAGTTTCTACCCTGTCCCCTTC 
TACTGGGGATGCACGGTTCCGATATACCCTACCTCAGAGACTTCATCATGTTTAGGAAAA 
CGGTCAAGAGATCAAACTGAAGGAAGAATCAATGATACTAATACAACAATAACTACTACA 
AGAGCAAGATTGGTCTCAGAATCTCTTAGAATGAATATCGAAGCTAGTAAGAGCGCTGTG 
TGGTCTAAGTTACCGACAAAACCCGAGAAAAAAACGCAAGGATTCAGTTTGTTCAATGGA 
TTTGACACAAAGGGAAACAGCAACAGAAGTAGCTTGGTCTCCGAAACTTCTCACAGTCTA 
CAAGCAAACCCTGCAGCGATGTCTAGAGCTATGAACTTCAGGGAGAGCATGCAACAATAA 
>G1903 Amino Acid Sequence (domain in AA coordinates: 134-180) 
MSKSRDTEIKLFGRTITSLLDVWCYDPSSLSPVHDVSSDPSKEDSSSSSSSCSPTIGPIR 
VPVKKSEQESNKFKDPYILSDLNEPPKAVSEISSPRSSKNNCDQQSEITTTTTTSTTSGE
KSTALKKPDKLIPCPRCESANTKFCYYNNYNVNQPRYFCRNCQRYWTAGGSMRNVPVGSG
RRKNKGWPSSNHYLQVTSEDCDNNNSGTILSFGSSESSVTETGKHQSGDTAKISADSVSQ

ENKSYQGFLPPQVMLPNNSSPWPYQWSPTGPNASFYPVPFYWGCTVPIYPTSETSSCLGK 
RSRDQTEGRINDTNTTITTTRARLVSESLRMNIEASKSAVWSKLPTKPEKKTQGFSLFNG
FDTKGNSNRSSLVSETSHSLQANPAAMSRAMNFRESMQQ* 
>G371 (1..582)

ATGGAGATTGAGAAGGATGAGGACGACACAACATTGGTTGATTCTGGAGGAGACTTCGAC 
TGCAACATATGTTTGGATCAGGTTCGAGACCCGGTCGTGACTTTATGTGGCCACCTGTTT 
TGTTGGCCCTGCATTCACAAGTGGACTTATGCGTCCAACAATTCAAGACAACGAGTCGAT 
CAATACGATCATAAGAGGGAACCACCAAAATGTCCGGTATGCAAATCTGATGTCTCCGAG 
GCTACGCTTGTCCCGATCTACGGACGAGGACAGAAAGCTCCCCAGTCCGGTTCAAATGTA 
CCGAGCAGACCAACTGGTCCGGTTTATGACTTAAGAGGAGTTGGTCAACGTTTAGGAGAA 
GGGGAGAGTCAACGTTACATGTATAGAATGCCTGATCCGGTGATGGGTGTGGTATGCGAA 
ATGGTATACCGGAGACTATTTGGAGAGTCTTCGAGCAACATGGCACCTTACCGCGATATG 
AATGTCCGGTCTAGGCGACGGGCAATGCAGGCTGAGGAGTCATTAAGCAGAGTCTACTTG 
TTTCTACTTTGCTTCATGTTTATGTGTCTATTTCTCTTCTAA 

>G371 Amino Acid Sequence (domain in aa coordinates: 21-74) 

MEIEKDEDDTTLVDSGGDFDCNICLDQVRDPVVTLCGHLFCWPCIHKWTYASNNSRQRVD

QYDHKREPPKCPVCKSDVSEATLVPIYGRGQKAPQSGSNVPSRPTGPVYDLRGVGQRLGE

GESQRYMYRMPDPVMGVVCEMVYRRLFGESSSNMAPYRDMNVRSRRRAMQAEESLSRVYL

FLLCFMFMCLFLF* 

>G597 (255..1310)

AAAATTCTCCTGTAAAATTTAATATTATAAAAGTGGTTTCTTTTTCATTTATGTTTATAT 
AATTTTCATCTTTAATCTTAAATTCTGGTAACCTTAATGCGCGATCCGCTTTTCTAAAGT 
TTTGTGAGAGAGAAGAGATCTAAAAAAATCCACAATTTTGTTCAAATCTTGGAGTTAAAT 
GCTGAATTTTAGGCCTTGTTGCTTAGATTTATGGCTTAAAGTTTCAAACTTTTCATTGGA 
TATGTGAGAAGAAAATGTCAGGATCTGAGACGGGTTTAATGGCGGCGACCAGAGAATCAA 
TGCAATTTACAATGGCTCTCCACCAGCAGCAGCAACACAGTCAAGCTCAACCTCAGCAGT
CTCAGAACAGGCCATTGTCATTCGGTGGAGACGACGGAACTGCTCTTTACAAGCAGCCGA 
TGAGATCAGTATCACCACCGCAGCAGTACCAACCCAACTCAGCTGGTGAGAATTCTGTCT 
TGAACATGAACTTGCCCGGAGGTGAGTCTGGAGGCATGACTGGAACTGGAAGTGAGCCAG 
TGAAAAAGAGGAGAGGTAGACCGAGGAAATATGGGCCTGATAGTGGTGAAATGTCACTTG 
GTTTGAATCCTGGAGCTCCTTCTTTCACTGTCAGCCAACCTAGTAGCGGCGGCGATGGAG 
GAGAGAAGAAGAGAGGAAGACCTCCTGGTTCTTCTAGCAAAAGGCTCAAGCTTCAAGCTT 
TAGGCTCGACTGGAATCGGATTTACGCCTCATGTACTTACCGTGCTGGCTGGAGAGGATG 
TATCATCCAAGATAATGGCGTTAACTCATAATGGACCCCGTGCTGTGTGTGTCTTGTCTG 
CAAATGGAGCCATCTCCAATGTGACTCTCCGCCAGTCTGCCACATCCGGTGGAACTGTTA 
CATATGAGGGGAGATTTGAGATTCTGTCTTTATCGGGATCTTTCCATTTGCTGGAGAACA 
ATGGTCAAAGAAGCAGGACGGGAGGTCTAAGCGTGTCATTATCAAGTCCGGATGGTAATG 
TCCTCGGTGGCAGTGTAGCTGGTCTTCTTATAGCAGCATCACCTGTTCAGATTGTTGTTG 
GGAGTTTCTTACCAGACGGAGAAAAAGAACCAAAACAGCATGTGGGACAAATGGGACTGT 
CGTCACCCGTATTACCGCGTGTGGCCCCAACGCAGGTGCTGATGACTCCAAGTAGCCCAC
AATCTCGAGGCACAATGAGTGAGTCATCTTGTGGAGGAGGACATGGAAGCCCTATTCATC 
AGAGCACTGGAGGACCTTACAATAACACCATTAACATGCCCTGGAAGTAGCCAAGTGATC 
TGTGTCGQCTTAAAACCAACAACTTCCCGTTATTAGAGTGATTTATTTCTACATTTGGTT 
TAGACTTTCTAGTTCTGATGGTTATTTCTACAGTTGGTTTAGACTTTCTAGTTCTGTTCA 
GACAAAAGGAGTTTGATAAATTGACCGACCTATTTTGTGTGTTTGAGGTACTTTCAGAAC 
CATAGGTGTTCAGAAATTAGAATGTTCTGTTTAAAAAA 

>G597 Amino Acid Sequence (domain in AA coordinates: 97-104, 137-144)

MSGSETGLMAATRESMQFTMALHQQQQHSQAQPQQSQNRPLSFGGDDGTALYKQPMRSVS 

PPQQYQPNSAGENSVLNMNLPGGESGGMTGTGSEPVKKRRGRPRKYGPDSGEMSLGLNPG 

APSFTVSQPSSGGDGGEKKRGRPPGSSSKRLKLQALGSTGIGFTPHVLTVLAGEDVSSKI

MALTHNGPRAVCVLSANGAISNVTLRQSATSGGTVTYEGRFEILSLSGSFHLLENNGQRS

RTGGLSVSLSSPDGNVLGGSVAGLLIAASPVQIVVGSFLPDGEKEPKQHVGQMGLSSPVL

PRVAPTQVLMTPSSPQSRGTMSESSCGGGHGSPIHQSTGGPYWNTINMPWK* 

>G1009 (28..1704)

AAAAAAAAAAAAAACCTATTCCCAAAGATGAAGAACAATAACAACAAATCTTCTTCTTCT 
TCTAGCTATGATTCTTCTTTGTCTCCTTCTTCTTCATCCTCCTCCCACCAGAACTGGCTC 
TCTTTCTCTCTCTCCAACAATAACAACAACTTCAATTCTTCCTCAAACCCTAATCTCACT 
TCCTCCACATCAGATCATCATCATCCTCACCCTTCTCACCTCTCTCTCTTTCAAGCTTTC 
TCCACTTCTCCAGTCGAACGGCAAGATGGGTCACCGGGAGTTTCACCCAGCGATGCCACG 
GCGGTTCTTTCCGTATACCCCGGCGGTCCTAAACTTGAGAACTTCCTCGGCGGAGGAGCC 
TCAACGACGACAACAAGACCAATGCAACAAGTGCAATCTCTTGGCGGCGTTGTCTTCTCT 
TCCGACCTACAGCCACCGCTTCATCCTCCGTCCGCCGCCGAGATCTACGACTCTGAGCTC 
AAGTCAATAGCCGCTAGCTTCCTAGGAAACTACTCCGGTGGACACTCGTCGGAGGTCTCT 
AGCGTACATAAACAACAACCGAATCCTCTAGCTGTCTCAGAGGCTTCGCCTACTCCGAAG 
AAGAACGTAGAGAGTTTTGGACAACGTACCTCGATTTATAGAGGAGTCACAAGACATAGA
TGGACTGGAAGATACGAAGCTCATCTATGGGATAATAGTTGCCGAAGAGAAGGCCAAAGC 
AGAAAAGGAAGACAAGTTTATTTAGGTGGTTATGATAAGGAAGATAAAGCAGCTAGAGCT 
TACGACCTTGCAGCTCTTAAGTATTGGGGTCCTACAACTACGACTAATTTCCCGATATCA 
AATTACGAATCTGAACTTGAAGAAATGAAACACATGACTCGACAAGAGTTCGTTGCTTCT 
TTAAGACGGAAAAGCAGTGGATTCTCTAGGGGTGCCTCCATGTACAGAGGCGTCACTAGA 
CATCATCAGCATGGTCGATGGCAGGCACGAATTGGAAGAGTTGCAGGCAACAAAGACCTT 
TATCTTGGCACATTTAGCACTCAAGAGGAAGCTGCAGAAGCTTATGATATAGCAGCGATC 
AAATTCCGCGGTCTAAATGCAGTCACCAATTTCGA 

ATTGCTAGCTGTAATCTCCCTGTGGGTGGACTAATGCCTAAACCTTCTCCAGCAACCGCA 
GCGGCTGACAAAACCGTTGATCTTTCTCCATCCGACTCTCCATCTCTAACCACACCGTCC 
CTCACGTTCAATGTGGGAACACCGGTCAATGACCATGGAGGAACTTTTTACCAGACTGGT 
ATACCAATCAAACCAGACCCGGCTGATCATTATTGGTCCAACATCTTTGGATTCCAGGCA
AACCCGAAAGCAGAAATGCGACCATTAGCAAACTTTGGGTCGGATCTTCATAAGCCTTCT 
CCTGGTTATGCTATAATGCCGGTAATGCAGGi^GGTGAAAACAACTTTGGTGGTAGTTTT 
GTTGGGTCTGATGGGTATAACAATCATTCCGCTGCATCGAACCCGGTCTCAGCAATTCCG 
CTGTCCTCGACAACTACAATGAGTAACGGTAACGAAGGGTATGGTGGAAACATAAACTGG 
ATTAATAACAACATTTCAAGTTCTTACCAAACTGCAAAATCAAATCTCTCTGTTTTGCAC 
ACACCGGTTTTTGGGTTGGAATGAGTATTCACATCTTAGTGAGAACTAAAATAAATATGT
AGGAAAAAAATAAGGCTCTGTTTGAAGAAATCAGATATTTTCTTCTTAGATTATTTAAGT 
AGTTTAAAAAAAATATTTTTTAAGTGTTTCACTTTTACGTTTGTCTGCTGACCACGAATT 
TTGCTGGATCTGACAGTACTAACTCTTTGTTTAATGACCTTATGGGTTCCTTTTTTACTT 
TCCAGAACTTTTATTTACTTTTTTCTTCATTTTTCTTGATTTTTTTTGTTGTGGGACAAT 
ATGAATGATTGAAGATGGAAACTGCTTGCATGTGAATAAACGAAAATCAAACNATCTTCG 
GTAACTTAAAAA 

>G1009 Amino Acid Sequence (domain in aa coordinates: 201-277, 303-371) 
MKNNNNKSSSSSSYDSSLSPSSSSSSHQNWLSFSLSNNNNNFNSSSNPNLTSSTSDHHHP

HPSHLSLFQAFSTSPVERQDGSPGVSPSDATAVLSVYPGGPKLENFLGGGASTTTTRPMQ
QVQSLGGVVFSSDLQPPLHPPSAAEIYDSELKSIAASFLGNYSGGHSSEVSSVHKQQPNP
LAVSEASPTPKKNVESFGQRTSIYRGVTRHRWTGRYEAHLWDNSCRREGQSRKGRQVYLG
GYDKEDKAARAYDLAALKYWGPTTTTNFPISNYESELEEMKHMTRQEFVASLRRKSSGFS

RGASMYRGVTRHHQHGRWQARIGRVAGNKDLYLGTFSTQEEAAEAYDIAAIKFRGLNAVT

NFDISRYDVKSIASCNLPVGGLMPKPSPATAAADKTVDLSPSDSPSLTTPSLTFNVATPV

NDHGGTFYHTGIPIKPDPADHYWSNIFGFQANPKAEMRPLANFGSDLHNPSPGYAIMPVM

QEGENNFGGSFVGSDGYNNHSAASNPVSAIPLSSTTTMSNGNEGYGGNINWINNNISSSY

QTAKSNLSVLHTPVFGLE* 

>G170 (1..1107)

ATGGGGATGAAGAAGGTGAAGCTATCTTTGATAGCTAATGAAAGATCAAGGAAAACATCC 

TTCATAAAGAGGAAAGACGGGATTTTTAAGAAACTCCACGAGTTGTCAACTCTGTGTGGT 

GTCCAAGCTTGTGCTCTCATCTACAGTCCATTCATACCGGTTCCAGAGTCATGGCCGTCA

AGGGAAGGTGCTAAAAAGGTGGCTTCAAGGTTTCTGGAGATGCCGCCGACAGCCCGAACC 

AAGAAGATGATGGATCAAGAGACTTACCTTATGGAGAGGATTACCAAAGCAAAAGAGCAA 

CTAAAGAACCTGGCTGCTGAGAACCGAGAGTTACAGGTTAGACGATTTATGTTTGATTGT 

GTTGAAGGCAAAATGTCCCAGTATCATTATGATGCAAAAGACCTTCAAGATTTGCAATCT 

TGTATAAATCTATATCTCGATCAGCTTAACGGAAGGATCGAGTCCATTAAAGAAAATGGT 

GAGTCGTTGTTGTCTTCCGTCTCTCCTTTTCCTACTAGAATTGGTGTTGACGAAATTGGT 

GATGAGTCATTTTCCGACTCTCCTATTCATGCTACAACTGGGGTTGTAGATACTCTTAAT 

GCTACCAATCCTCATGTTCTTACGGGCGATATGACTCCTTTTCTTGATGCGGACGCAACT 

GCGGTAACTGCTTCCAGTAGATTTTTTGATCATATTCCATATGAAAATATGAATATGAGT 

CAAAATCTGCATGAACCGTTTCAACACCTTGTTCCTACTAACGTTTGTGATTTTTTTCAA 

AATCAGAATATGAATCAGGTTCAATACCAGGCTCCTAATAATCTGTTTAATCAGATTCAA 

CGAGAATTCTACAACATAAATTTGAATCTGAATTTGAATCTGAATTCGAATCAGTATCTG 

AATCAACAACAATCATTGATGAATCCGATGGTGGAACAACATATGAA 

CGTGAAAGGATTCCTTTCGTGGACGGAAACTGCTACAACTACCATCAACTACCATCCAAT 

CAACTACCAGCCGTTGATCATGCTTCCACCAGTTACATGCCTTCCACCACCGGTGTCTAT

GATCCTTACATCAACAATAATCTCTAA 

>G170 Amino Acid Sequence (domain in aa coordinates: 2-57) 

MGMKKVKLSLIANERSRKTSFIKRKDGIFKKLHELSTLCGVQACALIYSPFIPVPESWPS

REGAKKVASRFLEMPPTARTKKMMDQETYLMERITKAKEQLKNLAAENRELQVRRFMFDC

VEGKMSQYHYDAKDLQDLQSCINLYLDQLNGRIESIKENGESLLSSVSPFPTRIGVDEIG
DESFSDSPIHATTGVVDTLNATNPHVLTGDMTPFLDADATAVTASSRFFDHIPYENMNMS
QNLHEPFQHLVPTNVCDFFQNQNMNQVQYQAPNNLFNQIQREFYNINLNLNLNLNSNQYL

NQQQSFMNPMVEQHMNETVGGRESIPFVDGNCYNYHQLPSNQLPAVDHASTSY 

DPYINNNL* 

>G1768 (185..1426)

CTTCCTTTTGCTTCAGCTGCGAGCTTTGGTTGGATCTCTCACTTGCAAAACCAAATCCCT 
TATCGACTTCCACCGAAAGATCACTTCTTAACCTACACAAGGTGTTTGTTATGAAGATCA 
GATAAATAAAAGGTCATTTGAGGATAATGGTTGATGTTCAAAGATTCTTACTTGCTTATT 
TGTGATGGACAATGTAAGAGGTTCAATAATGTTGCAGCCACTGCCAGAGATAGCTGAGAG 
TATCGATGATGCTATCTGCCATGAACTCTCCATGTGGCCTGATGATGCTAAAGATTTGTT 
ATTGATAGTGGAGGCAATATCAAGGGGAGACTTGAAGTTGGTACTTGTTGCTTGTGCAAA 
AGCTGTTTCTGAGAATAATCTTCTAATGGCACGATGGTGTATGGGTGAGTTGCGCGGTAT 
GGTTTCGATTTCTGGTGAGCCAATCCAGAGATTGGGAGCTTATATGTTAGAAGGGCTTGT 
TGCTAGGCTTGCTGCTTCTGGTAGTTCGATATATAAGTCTCTCCAGTCCAGAGAACCAGA 
GAGTTATGAATTTTTATCTTATGTGTATGTTCTGCATGAGGTTTGTCCATATTTCAAGTT 
TGGATACATGTCAGCGAATGGTGCGATTGCAGAAGCAATGAAGGATGAAGAGAGGATTCA
CATTATTGACTTCCAAATTGGACAAGGGAGCCAGTGGATAGCACTTATCCAGGCTTTTGC 
AGCTAGGCCTGGTGGGGCTCCAAATATTCGAATTACCGGAGTTGGTGATGGATCTGTCTT
GGTTACAGTCAAGAAGAGACTAGAGAAACTTGCAAAGAAGTTTGATGTTCCATTCAGGTT 
CAATGCGGTTTCAAGGCCAAGTTGTGAAGTTGAAGTGGAAAATCTTGATGTCCGAGATGG 
CGAAGCCCTTGGAGTGAACTTTGCTTACATGCTGCATCATTTGCCAGATGAGAGTGTAAG 
CATGGAAAACCACAGGGACCGGTTGCTGAGGATGGTGAAGAGTCTATCACCTAAAGTAGT 
CACTCTTGTGGAACAAGAATGCAACACGAACACTTCCCCTTTCCTTCCTAGGTTCCTTGA 
GACATTAAGTTATTACACGGCAATGTTCGAATCTATCGATGTTATGCTTCCGAGAAATCA 
CAAGGAAAGGATCAATATCGAGCAGCACTGCATGGCAAGGGATGTCGTCAACATCATAGC 
TTGTGAAGGAGCCGAGAGGATCGAAAGACACGAGCTTCTCGGGAAATGGAAGTCAAGGTT 
TTCCATGGCGGGTTTTGAGCCATACCCCTTGAGCTCAATCATTTCAGCCACCATTAGAGC 
CCTCTTGAGAGATTACAGCAACGGGTATGCGATTGAAGAAAGAGATGGTGCTCTGTACCT 
TGGTTGGATGGACCGAATCTTGGTCTCATCTTGTGCATGGAAGTGAAAGAATAAACGTCT 
CCAAGAATGTAATGCAAAAGACAGAACTGGAAGTAATAGATAGTTTTGTCTCATAACCAT 
TAATAAGGTTGAATCAAATCATATACATCCCCATGCTACAACTATTACACAGGCTCCATC 
AACAAAGAAGGGCTCTTGTTGTGTTACCTTCTCTTCCTGTAACTCTTATTTGAACCAAAT 
GGAAGTGGTTACAT 

>G1768 Amino Acid Sequence (domain in AA coordinates: 54-413) 

I^NVRGSIMLQPLPEIAESIDDAICHELSMWPDD 

VSEIWIjLMARWCMGEIjRGMVSISGEPIQRL^ 

YEFLSYVYVLHEVCPYFKFGYMSANGAIAEAMKDEERIHIIDFQIGQGSQWIALIQAFAA

RPGGAPNIRITGVGDGSVLVTVKKRLEKDAKXFDVPFRFNAVSRPSCEVE 

ALGVNFAYMLHHLPDESVSMENHRDRL^ 

LSYYTAMFESIDVMLPRiraKERINIEQHCMARDVVNIIACEGAERIERHE 
MAGFEPYPLSSIISATIRALLRDYSNGYAIEERDGALYLGWMDRILVSSCAWK*
>G185 (77..988)

ATGCAAAAATAAACATAGTAACAATACTTTAAACTATTTACACCACTTTAATCTTATTCT 
CCACTCTTTGAACGTAATGGAGAAGAACCATAGTAGTGGAGAGTGGGAGAAGATGAAGAA 
CGAGATCAACGAGCTAATGATAGAAGGAAGAGACTATGCACACCAGTTTGGATCAGCTTC 
ATCTCAAGAAACACGTGAACATTTAGCCAAAAAGATTCTTCAATCTTACCACAAGTCTCT 
CACCATCATGAACTACTCCGGCGAACTTGACCAAGTTTCTCAGGGTGGAGGAAGCCCCAA 
GAGCGATGATTCCGATCAAGAACCACTTGTCATCAAGAGTTCGAAGAAGTCAATGCCAAG 
GTGGAGTTCAAAAGTCAGAATTGCCCCTGGAGCTGGTGTTGATAGAACGCTGGACGATGG 
ATTCAGTTGGAGAAAGTACGGCCAGAAGGATATTCTCGGAGCCAAATTTCCAAGAGGATA
CTATAGATGCACGTATAGAAAGTCTCAAGGATGTGAAGCCACTAAACAAGTCCAAAGATC 
TGATGAAAATCAGATGCTCCTTGAGATCAGTTACCGAGGAATACATTCTTGCTCTCAAGC 
TGCAAATGTCGGTACAACAATGCCGATACAAAACCTCGAACCGAACCAGACCCAAGAACA 
CGGAAATCTTGACATGGTAAAGGAAAGTGTAGACAACTACAATCACCAAG(^CATTTGCA 
TCACAACCTTCACTATCCATTGTCATCTACCCCAAATCTAGAGAATAACAATGCCTATAT 
GCTTCAAATGCGAGATCAAAACATCGAATATTTTGGATCTACGAGCTTCTCTAGTGATCT 
AGGAACTAGTATCAACTACAATTTTCCAGCATCTGGCTCGGCTTCTCACTCAGCATCAAA 
CTCTCCGTCCACCGTCCCTTTGGAATCCCCGTTTGAAAGCTATGATCCAAATCATCCATA 
TGGAGGATTTGGTGGGTTCTATTCTTAGTTATCTACTTAAGGGAGGGACGGAACTTTTTA 
CATGACCTCTTGATTAAAGAGAGAGTTTTCATAATAGCTAATCAATTTCCTATTCAAATA 
TCCGAGTTTTTTTTCTAATGATGTTTATCAATTGTCTTATTACAGAAGGCTTATTTTCAG 
GTCTATGTTGAAATAAATGGATTTGTACTCGTAGGTATGATCCTTGTTATCTAAAAAAAA 
AAAAA

>G185 Amino Acid Sequence (domain in AA coordinates: 113-172) 
MEKIOTSSGEWEKMKNEINELMIEGRDYAHQFGSA 

SGELDQVSQGGGSPKSDDSDQEPLVIKSSKKSMPRWSSKVRIAPGAGVDRTLDDGFSWRK 

YGQKDILGAKFPRGYYRCTYRKSQGCEATKQVQRSDENQMLLEISYRGIHSCSQAANVGT

TMPIQNLEPNQTQEHGNLDWKESVDNYNHQAHLHHNLHYPLSSTPNLENl^ 

QNIEYFGSTSFSSDLGTSINYNFPASGSASHSASNSPSTVPLESPFESYDPNHPYGGFGG

FYS* 

>G1931 (5..592)

ATCAATGGAAGGGGTTGACAACACAAATCCTATGTTAACCCTAGAAGAAGGCGAAAACAA 
CAATCCTTTTTCTTCCTTAGATGACAAAACATTAATGATGATGGCTCCTTCGTTAATCTT

TTCGGGCGATGTAGGTCCATCTTCTTCTTCTTGTACTCCAGCAGGTTATCATCTATCTGC 

TCAGCTGGAGAACTTTCGAGGAGGTGGAGGAGAGATGGGAGGATTAGTGAGTAATAATAG 

CAATAATAGTGATCATAATAAGAATTGCAACAAAGGAAAAGGGAAGAGAACTTTGGCAAT 

GCAGAGGATAGCTTTTCATACAAGGAGTGATGATGATGTTCTTGATGATGGTTATCGTTG 

GCGAAAGTACGGTCAGAAATCTGTCAAGAACAATGCTCATCCCAGGAGCTATTATAGATG 

TACATACCACACATGCAACGTGAAGAAACAAGTGCAAAGACTGGCAAAAGATCCAAACGT 

TGTCGTAACAACCTACGAAGGTGTTCATAATCATCCTTGTGAGAAGCTCATGGAGACTCT 

TAGCCCTCTCCTTAGGCAACTTCAGTTCCTCTCAAGAGTTTCTGATCTGTAATTATTGAA 

TGTTAATTAGTGGTGTAATACATTAATTATGCTTTAATCTCTCCATTGACCCTCAATC 

>G1931 Amino Acid Sequence (domain in AA coordinates: 114-170) 

MEGVDNTNPMLTLEEGENttnro^ 

LENFRGGGGEMGGLVSNNSNNSDHNKNCNKGKGKRTLAMQRIAFHTRSDDDVLDDGYRWR

KYGQKSVKI^AHPRSYYRCTYHTCNVKKQVQRJ^ 

PLLRQLQFLSRVSDL*

>G2543 (1..2169) 

ATGAGTTTCGTCGTCGGCGTCGGCGGAAGTGGTAGTGGAAGCGGCGGAGACGGTGGTGGT 

AGTCATCATCACGACGGCTCTGAAACTGATAGGAAGAAGAAACGTTACCATCGTCACACC 

GCTCAACAGATTCAACGCCTTGAATCGAGTTTCAAGGAGTGTCCTCATCCAGATGAGAAA 

CAGAGGAACCAGCTTAGCAGAGAATTGGGTTTGGCTCCAAGACAAATCAAGTTCTGGTTT 

CAGAACAGAAGAACTCAGCTTAAGGCTCAACATGAGAGAGCAGATAATAGTGCACTAAAG 

GCAGAGAATGATAAAATTCGTTGCGAAAACATTGCTATTAGAGAAGCTCTCAAGCATGCT 

ATATGTCCTAACTGTGGAGGTCCTCCTGTTAGTGAAGATCCTTACTTTGATGAACAAAAG 

CTTCGGATTGAAAATGCACACCTTAGAGAAGAGCTTGAAAGAATGTCTACCATTGCATCA

AAGTACATGGGAAGACCGATATCGCAACTCTCTACGCTACATCCAATGCACATCTCACCG 

TTGGATTTGTCAATGACTAGTTTAACTGGTTGTGGACCTTTTGGTCATGGTCCTTCACTC 

GATTTTGATCTTCTTCCAGGAAGTTCTATGGCTGTTGGTCCTAATAATAATCTGCAATCT 

CAGCCTAACTTGGCTATATCAGACATGGATAAGCCTATTATGACCGGCATTGCTTTGACT 

GCAATGGAAGAATTGCTCAGGCTTCTTCAGACAAATGAACCTCTATGGACAAGAACAGAT 

GGCTGCAGAGACATTCTCAATCTTGGTAGCTATGAGAATGTTTTCCCAAGATCAAGTAAC

CGAGGGAAGAACCAGAACTTTCGAGTCGAAGCATCAAGGTCTTCTGGTATTGTCTTCATG 

AATGCTATGGCACTTGTCGACATGTTCATGGATTGTGTCAAGTGGACAGAACTCTTTCCC 

TCTATCATTGCAGCTTCTAAAACACTTGCAGTGATTTCTTCAGGAATGGGAGGTACCCAT 

GAGGGTGCATTGCATTTGTTGTATGAAGAAATGGAAGTGCTTTCGCCTTTAGTAGCAACA 

CGCGAATTCTGCGAGCTACGCTATTGTCAACAGACTGAACAAGGAAGCTGGATAGTTGTA 

AACGTCTCATATGATCTTCCTCAGTTTGTTTCTCACTCTCAGTCCTATAGATTTCCATCT 

GGATGCTTGATTCAGGATATGCCCAATGGATATTCCAAGGTTACTTGGGTTGAACATATT 

GAAACTGAAGAAAAAGAACTGGTTCATGAGCTATACAGAGAGATTATTCACAGAGGGATT 

GCTTTTGGGGCTGATCGTTGGGTTACCACTCTCCAGAGAATGTGTGAAAGATTTGCTTCT 

CTATCGGTACCAGCGTCTTCATCTCGTGATCTCGGTGGAGTGATTCTATCACCGGAAGGG 

AAGAGAAGCATGATGAGACTTGCTCAGAGGATGATCAGCAACTACTGTTTAAGTGTCAGC 

AGATCCAACAACACACGCTCAACCGTTGTTTCGGAACTGAACGAAGTTGGAATCCGTGTG 

ACTGCACATAAGAGCCCTGAACCAAACGGCACAGTCCTATGTGCAGCCACCACTTTCTGG 

CTTCCCAATTCTCCTCAAAATGTCTTCAATTTCCTCAAAGACGAAAGAACCCGTCCTCAG 

TGGGATGTTCTTTCAAACGGAAACGCAGTGCAAGAAGTTGCTCACATCTCAAACGGATCA 

CATCCTGGAAACTGCATATCGGTTCTACGTGGATCCAATGCAACACATAGCAACAACATG 

CTTATTCTGCAAGAAAGCTCAACAGACTCATCAGGAGCATTTGTGGTCTACAGTCCAGTG 

GATTTAGCAGCATTGAACATCGCAATGAGCGGTGAAGATCCTTCTTATATTCCTCTCTTG 

TCCTCAGGTTTCACAATCTCACCAGATGGAAATGGCTCAAACTCTGAACAAGGAGGAGCC 

TCGACGAGCTCAGGACGGGCATCAGCTAGCGGTTCGTTGATAACGGTTGGGTTTCAGATA 

ATGGTAAGCAATTTACCGACGGCAAAACTGAATATGGAGTCGGTGGAAACGGTTAATAAC 

CTGATAGGAACAACTGTACATCAAATTAAAACCGCCTTGAGCGGTCCTACAGCTTCAACT 

ACAGCTTGA 

>G2543 Amino Acid Sequence (domain in AA coordinates: 31-91) 
MSFVVGVGGSGSGSGGDGGGSHHHDGSETDRKKKRYHRHTAQQIQRLESSFKECPHPDEK
QRNQLSRELGLAPRQIKFWFQNRRTQLKAQHERADNSALKAENDKIRCENIAIREALKHA
ICPNCGGPPVSEDPYFDEQKLRIENAHLREELERMSTIASKYMGRPISQLSTLHPMHISP
LDLSMTSLTGCGPFGHGPSLDFDLLPGSSMAVGPNNNLQSQPNLAISDMDKPIMTGIALT
AMEELIiRLLQTNEPLWTRTDGC^ 

NAMALVDMFMDCVKWTELFPSIIAASKTLAVISSGMGGTHEGALHLLYEEMEVLSPLVAT
REFCELRYCQQTEQGSWIVVNVSYDLPQFVSHSQSYRFPSGCLIQDMPNGYSKVTWVEHI
ETEEKELVHELYREIIHRGIAFGADRWVTTLQRMCERFASLSVPASSSRDLGGVILSPEG
KRSMMRLAQRMISNYCLSVSRSNNTRSTWSELNEVG 

LPNSPQNVFNFLKDERTRPQWDVLSNGNAVQEVAHISNGSHPGNCISVLRGSNATHSNNM 

LILQESSTDSSGAFVVYSPVDLAALNIAMSGEDPSYIPLLSSGFTISPDGNGSNSEQGGA

STSSGRASASGSLITVGFQIIWSmPTAKIiNM^ 

TA* 

>G264 (30..1430)

CTTGTACCAGTTTCTGATTAGATTCAACAATGAACGGCGCATTAGGTAACTCCTCCGCCT 

CCGTTAGCGGCGGAGAAGGAGCCGGAGGACCAGCGCCTTTCTTGGTGAAAACCTACGAGA 

TGGTCGACGATTCATCAACGGACCAGATCGTATCGTGGAGCGCTAACAACAACAGCTTCA 

TCGTTTGGAATCATGCCGAATTTTCACGCCTCCTTCTTCCAACCTACTTCAAACACAATA 

ACTTCTCTTCCTTCATTCGTCAGCTCAATACCTATGGGTTTAGGAAGATTGATCCAGAGA 

GGTGGGAGTTTTTGAATGATGATTTTATTAAGGATCAGAAGCATCTTCTCAAGAATATAC 

ATAGAAGGAAACCTATACACAGCCACAGTCATCCACCTGCTTCGTCGACTGATCAAGAAA 

GAGCAGTGTTGCAAGAGCAAATGGACAAGCTTTCACGTGAGAAAGCTGCAATTGAAGCTA 

AGCTTTTAAAGTTCAAACAACAGAAGGTTGTAGCAAAGCATCAGTTTGAAGAAATGACTG 

AGCATGTTGATGATATGGAGAATAGGCAGAAGAAGCTGCTGAATTTTTTGGAAACTGCGA 

TTCGGAATCCTACTTTTGTTAAGAATTTTGGTAAGAAAGTCGAGCAGTTGGATATTTCAG 

CTTACAACAAAAAGCGAAGGCTCCCTGAAGTTGAGCAATCAAAGCCACCTTCAGAAGATT 

CTCATCTGGATAATAGTAGTGGTAGCTCGAGACGCGAGTCTGGAAACATTTTTCATCAAA 

ATTTCTCTAATAAATTGCGACTAGAGCTTTCTGCAGCTGATTCAGATATGAACATGGTTT 

CACACAGTATACAAAGTTCCAATGAAGAAGGTGCGAGTCCCAAAGGGATACTGTCAGGAG 

GTGATCCAAATACTACACTAACAAAAAGAGAAGGCCTACCATTTGCACCTGAAGCTCTAG 

AGCTTGCGGATACCGGGACATGCCCGAGGAGATTACTGTTAAATGATAATACAAGGGTGG 

AGACCTTGCAGCAGAGGCTAACTTCTTCAGAGGAGACTGATGGTAGCTTTTCATGTCATT 

TAAATCTAACCCTGGCTTCTGCTCCGTTACCGGACAAAACAGCTTCACAGATAGCTAAGA 

CGACTCTTAAAAGTCAGGAGTTAAACTTTAACTCAATAGAAACAAGTGCAAGTGAGAAAA

ATCGGGGTAGACAAGAGATTGCAGTTGGAGGTAGCCAAGCAAATGCAGCTCCTCCAGCAA 

GAGTGAATGATGTATTCTGGGAACAGTTCCTAACAGAAAGGCCAGGGTCTTCAGATAATG 

AGGAGGCAAGTTCGACTTATAGAGGTAACCCATACGAAGAGCAAGAGGAGAAAAGAAACG 

GGAGTATGATGTTACGTAATACAAAGAATATCGAGCAGCTGACCTTATAAACTATTTGGA 

CGGTTACATCAACGAGAGTACGAACTGAGGTTTTGGTAAGAAGTATC3GGTGAGTAAGTAA 

TGAAACATTGGACTGAAAAAGCGTAAGTAGCTTTGTTGTAAACACTTGCGTCTCTGTCTA 

CACAAGTAATTTGACTGTAAATGTAAGTGTACAGGATTTAAATTGAATAAGCA 

>G264 Amino Acid Sequence (domain in AA coordinates: 24-114) 

MNGALGNSSASVSGGEGAGGPAPFLVKTYEN^ 

LLLPTYFKHNNFSSFIRQLNTYGFRKIDPERWEFLETODF 

HPPASSTDQERAVLQEQMDKLSREKAAIEAKLLKFKQQKW^ 

KKLLNFLETAIRNPTFVKNFGKKVEQ 

RRESGNIFHQNFSNKLRLELSPADSDMNMVSHSIQSSNEEGASPKGILSGGDPNT 
EGLPFAPEALEIJUDTGTCPRRLLLiroNTRra 

PDKTASQIAKTTLKSQELNFNSIETSASEKNRGRQEIAVGGSQANAAPPARVNDVFWEQF 

LTERPGSSDNEEASSTYRGNPYEEQEEKRNGSMMLRNTKNIEQLTL*

>G32 (101..736)

AACACACATTCCCTCTCTTCCTTCAACTAGAAAAAAGATAGATATATCGGACATTTATTG 
ATCTGTGTATGCATAAAGGTATAGTATCATTATTAGAAAGATGAACACAACATCATCAAA 
GAGCAAGAAGAAGCAAGACGATCAGGTTGGTACAAGGTTTCTTGGGGTGAGAAGAAGGCC 
TTGGGGAAGATACGCAGCTGAGATTAGAGACCCAACTACGAAGGAGCGTCACTGGCTTGG 
CACTTTCGATACGGCGGAAGAAGCTGCCTTGGCCTACGATAGAGCTGCTCGGTCCATGCG 
TGGCACACGTGCCAGAACCAACTTTGTTTACTCAGACATGCCTCCTTCCTCATCCGTCAC 
CTCCATTGTTTCTCCTGACGATCCTCCTCCTCCTCCACCTCCTCCTGCTCCTCCTAGCAA 
TGATCCTGTCGATTACATGATGATGTTTAACCAATACTCATCCACTGACTCGCCAATGCT 
TCAGCCTCATTGTGATCAAGTGGACAGTTACATGTTTGGTGGCTCTCAATCTTCGAATTC 
TTATTGCTATTCTAATGACAGTAGTAATGAGCTGCCTCCTCTCCCGAGCGACTTGTCGAA
TTCGTGTTATAGCCAACCACAGTGGACCTGGACCGGTGACGACTACTCGTCTGAGTACGT 
ACATAGTCCAATGTTGAGCAGAATGCCTCCGGTTTCTGACTCTTTCCCTCAAGGTTTCAA 
CTACTTTGGCTCCTAATTCTTTCTCATCGTCCATATTTAATACCTTCCTCATTTGTACCT 
TTTCCTTCTTCTTCTTTTTTGGGTTTATCTATGTTTCGCCGTCCTTGATCTCTGCCTATG 
TGATCAAAGTGACTGTTTGTCATTAGTTTTTGAATAACAAGTTATCATTTGTATCTTGAA 
AAAAAAAAAAA 

>G32 Amino Acid Sequence (domain in aa coordinates: 17-84) 
MNTTSSKSKKKQDDQVGTRFLGVRRRPWGRYAAEIRDPTTKERHWLGTFDTAEEAALAYD
RAARSMRGTRARTNFVYSDMPPSSSVTSIVSPDDPPPPPPPPAPPSNDPVDYMMMFNQYS
STDSPMLQPHCDQVDSYMFGGSQSSNSYCYSNDSSNELPPLPSDLSNSCYSQPQWTWTGD
DYSSEYVHSPMFSRMPPVSDSFPQGFNYFGS*
>G436 (1..2157) 

ATGGATTTTACTCGCGATGACAACTCAAGTGATGAACGGGAAAATGATGTAGACGCCAAC 
ACCAAGAACCGTCACGAGAAGAAGGGTTACCATCGCCACACTAATGAACAAATTCATAGG 
CTTGAAACGTATTTCAAGGAATGTCCTCATCCAGACGAATTTCAGCGACGTCTGTTGGGT 
GAAGAACTGAATCTGAAACCAAAACAAATCAAATTTT^ 

GCTAAGAGTCACAATGAAAAAGCAGACAATGCAGCGCTTAGGGCAGAAAATATTAAGATT 

AGACGTGAGAACGAATCAATGGAAGATGCACTGAATAATGTGGTTTGCCCTCCATGTGGT 

GGTCGTGGTCCTGGGAGAGAAGACCAACTTCGACATCTCCAAAAACTCCGTGCACAAAAC 

GCTTATCTCAAAGATGAGTATGAAAGAGTCTCAAACTACCTAAAACAGTACGGAGGTCAC 

TCAATGCATAACGTCGAGGCCACACCCTATCTCCATGGTCCATCAAACCATGCATCAACG 

TCCAAGAACCGTCCAGCATTGTACGGAACCTCTTCTAACCGTCTCCCCGAGCCTTCAAGC 

ATATTTAGAGGACCATACACTCGTGGAAACATGAACACCACCGCACCGCCTCAGCCGCGA 

AAGCCGCTGGAAATGCAGAATTTCCAACCACTATCTCAACTGGAGAAAATTGCAATGTTG 

GAAGCAGCGGAAAAAGCGGTGTCAGAGGTTTTGAGCCTCATTCAAATGGATGATACAATG 

TGGAAAAAGTCGTCTATTGATGATAGGCTCGTCATTGATCCAGGGCTCTATGAGAAATAT 

TTTACTAAGACTAACACAAATGGTCGTCCTGAGTCTTCTAAAGATGTCGTGGTGGTTCAA 

ATGGATGCTGGAAACTTGATCGACATCTTCTTAACTGCGGAGAAATGGGCGAGGCTTTTT 

CCAACAATTGTGAACGAAGCTAAAACGATTCACGTCTTGGATTCCGTTGACCATCGAGGA 

AAAACTTTCTCAAGAGTGATTTATGAGCAACTGCACATACTGTCACCATTGGTGCCACCG 

AGGGAATTTATGATCCTAAGGACTTGCCAACAAATTGAAGACAATGTCTGGATGATTGCT 

GATGTGTCGTGTCATCTCCCAAACATTGAGTTTGATCTTTCGTTTCCCATTTGCACCAAA 

CGTCCCTCAGGTGTGCTCATTCAAGCCTTGCCCCACGGCTTCTCTAAGGTGACGTGGATA 

GAGCATGTGGTAGTGAATGATAATAGAGTGCGGCCACATAAGCTTTACAGAGACCTCTTA 

TACGGCGGCTTTGGCTACGGAGCTCGACGTTGGACCGTTACTCTTGAGAGGACGTGTGAG 

AGGCTGATTTTCTCCACCTCCGTCCCTGCCTTGCCCAACAATGACAATCCCGGAGTTGTG 

CAAACAATACGAGGCAGAAATAGCGTAATGCATTTGGGAGAAAGAATGTTGAGGAACTTT 

GCATGGATGATGAAAATGGTTAACAAACTCGACTTCTCGCCACAGTCTGAAACTAACAAC 

AGCGGAATTAGGATTGGGGTGCGGATAAACAATGAGGCGGGTCAACCGCCCGGTCTCATT 

GTCTGTGCTGGTTCATCTTTATCCCTCCCTCTCCCTCCTGTCCAAGTGTACGATTTCCTT 

AAGAATCTGGAGGTTCGTCACCAGTGGGACGTTCTGTGCCATGGGAATCCAGCGACTGAG

GCTGCTCGTTTCGTCACCGGATCAAACCCAAGGAACACTGTGTCTTTTCTCGAGCCTTCA 

ATTAGGGATATTAATACTAAGCTAATGATACTCCAAGATAGCTTCAAAGATGCATTGGGA 

GGAATGGTGGCCTACGCTCCAATGGATCTAAACACCGCCTGCGCTGCCATTTCAGGCGAT 

ATCGATCCTACCACCATTCCAATCCTCCCTTCCGGTTTTATGATCTCCCGTGACGGCCGT 

CCTTCCGAGGGCGAAGCCGAGGGTGGCAGCTATACACTCCTCACCGTGGCTTTCCAGATC 

CTTGTCTCCGGTCCSAGTTACTCTCCTGATACCAACCTGGAAGTTTCTGCCACCACAGTC 

AATACCTTGATTAGCTCCACCGTTCAAAGGATCAAAGCCATGCTCAAGTGCGAATGA 

>G436 Amino Acid Sequence (domain in AA coordinates: 22-85) 

MDFTRDDNSSDERENDVDANTNNRHEKKGYHRHTNEQIHRLETYFKECPHPDEFQRRLLG

EELNLKPKQIKFWFQNKRTQAKSHNEKADNAALRM 

GRGPGREDQLRHLQKLRAQNAYLKDEYERVSNYLKQYGGHSMHWEATPYLHGPSMIAST 
SKKRPALYGTSSNRLPEPSSIFRGPYTRGNMNTTAPPQPRKPLEMQNFQPLSQLEKIAML 
EAAEKAVSEVLSLIQMDDTMWKKSSIDDRLVIDPGLYEKYFTKTNTNGRPESSKDVVVVQ
MDAGNLIDIFLTAEKWARLFPTIVNEAKTIHVLDSVDHRGKTFSRVIYEQLHILSPLVPP
REFMILRTCQQIEDNVWMIADVSCHLPNIEFDLSFPICTKRPSGVLIQALPHGFSKVTWI
EHVVVNDNRVRPHKLYRDLLYGGFGYGARRWTVTLERTCERLIFSTSVPALPNNDNPGVV

QTIRGRNSVMHLGERMLRNFAWMMKMVNKLDFSPQSETNNSGIRIGVRINNEAGQPPGLI
VCAGSSLSLPLPPVQVYDFLKNLEVRHQWDVLCHGNPATEAARFVTGSNPRNTVSFLEPS
IRDINTKLMILQDSFKDALGGMVAYAPMDLNTACAAISGDIDPTTIPILPSGFMISRDGR
PSEGEAEGGSYTLLTVAFQILVSGPSYSPDTNLEVSATTVNTLISSTVQRIKAMLKCE*
>G556 (50..1144)

CTTTTTTGAAGCCCTTTTGACACAAAAGACCAGAACAAGTTGAAGAAATATGAATACAAC 
CTCGA^CATTTTGTTCCACCGAGAAGGTTTGAAGTTTACGAGCCTCTCAACa^^TC^ 
TATGTGGGAAGAAAGTTTCAAGAACAATGGAGACATGTATACGCCTGGCTCTATCATAAT 
CCCGACTAACGAAAAACCAGACAGCTTGTCAGAGGATACTTCTCATGGGACAGAAGGAAC 
TCCTCACAAGTTTGACGAAGAGGCTTCCACATCTAGACATCCTGAT^ 

GCTAGCACAGAATCGAGAGGCAGCTAGGAAAAGTCGTTTGCGCAAGAAAGCTTATGTTCA 
GCAGCTAGAGACTAGCCGGTTAAAGCTAATTCATTTAGAGCAAGAACTCGATCGTGCTAG 
ACAACAGGGTTTCTATGTGGGGAACGGAGTAGATACCAATGCTCTTAGTTTCTCAGATAA 
CATGAGCTCAGGGATTGTTGCATTTGAGATGGAATATGGACATTGGGTGGAAGAACAGAA
CAGGCAAATATGTGAACTAAGAACGGTTTTACATGGACAAGTTAGTGATATAGAGCTTCG 
TTCTCTAGTCGAGAATGCCATGAAACATTACTTTCAACTCTTCCGAATGAAGTCAGCCGC 
TGCAAAAATCGATGTTTTCTATGTCATGTCCGGAATGTGGAAAACTTCAGCAGAGCGGTT 
TTTCTTGTGGATAGGCGGATTTAGACCCTCAGAGCTTCTCAAGGTTCTGTTACCGCATTT 
TGATCCTTTGACGGATCAACAACTTTTGGATGTATGTAATCTGAGGCAATCATGTCAACA 
ATCAGAAGATGCGTTATCCCAAGGTATGGAGAAACTGCAACATACATTAGCAGAGAGTGT 
AGCAGCCGGGAAACTTGGTGAAGGAAGTTATATTCCTCAAATGACTTGTGCTATGGAGAG 
ATTGGAGGCTTTGGTCAGCTTTGTAAATCAAGCTGATCATCTGAGACATGAGACATTGCA 
ACAGATGCATCGGATCTTAACCACGCGACAAGCGGCTAGAGGTTTGTTAGCATTAGGGGA 
GTATTTCCAAAGGCTTCGAGCTTTGAGTTCGAGTTGGGCGGCTAGGCAACGTGAACCAAC 
GTAATTAAGGTGTTTAGATGTCAAGAAAGGTTTGAGACCTTAACAATCAAGAATGGAGTT 
TGCTGGTGAGTGGATTTTTGGGTCAAGAACAAGAGCAATAACACAAGCTGCTGTGTGATG 
ATGAATCTTGTCTTGCGGCTAAAGGAAATGTTTGAGGAAAGTTGTACATATGATCAGCAA 
CGTAAAGTTTATAGCTTTTTAGAAACCAACTTTTCGATGGTTGTTCTTTTTTTTTTGTAT 
GTAATATTATAGATAAGCTTGTGGTATATATGATTTTAATGTGACATTACGAACTTGATT 
TATAACCATGGTAAAAT 

>G556 Amino Acid Sequence (domain in AA coordinates: 83-143) 
MNTTSTHFVPPRRFF^EPl^QIG 

TEGTPHKFDQEASTSRHPDKIQRRLAQNREAARKSRLRKKAYVQQLETSRLKLIHLEQEL

DRARQQGFYVGNGVDTNALSFSDNMSSGIVAFEMEYGHWVEEQNRQICELRTVLHGQVSD

IELRSLVENAMKHYFQIiFRMKSAAAKIDWYWSGMWKTSAERFFLWIGGFRPSELLK^ 

LPHFDPLTDQQLLDVCNLRQSCQQSEDALSQGMEKLQHTLAESVAAGKLGEGSYIPQMTC

AMERLEALVSFWQADHLRHETLQQMHRILTTRQAARGLLALGEYFQRLRALS 

REPT* 

>G1420 (39..1238)

AAAGTATCATCTCATAGATTCCATCTTTTCTCTATTACATGGAGAAGAAAAAAGAAGAGG 
ATCATCATCATCAACAACAACAACAACAACAAAAGGAGATCA^ 

TCGAGCAAGAACAAGAACAAGAACAAAAACAAGT^AATCTCTCAAGCATCATCATCATC^ 
ACATGGCGAATCTAGTTACGTCATCAGATCATCATCCGTTGGAGCTAGCTGGAAATCTCT 
CAAGCATCTTCGATACTTCATCTTTACCTTTTCCTTATTCTTATTTCGAAGATCACTCTT 
CTAATAATCCTAATTCTTTCCTAGACTTGCTCCGACAAGATCATCAGTTTGCTTCTTCCT 
CTAATTCCTCTTCTTTTTCATTCGATGCCTTTCCTGTCCCCAATAACAACAACAACACCT 
CTTTTTTTACGGATTTGCCCTTACCTCAAGCTGAGTCATCAGAAGTCGTGAACACAACAC 
CGACTTCTCCAAACTCAACCTCAGTCTCATCTTCCTCCAACGAAGCTGCAAATGATAACA 
ACAGTGGTAAAGAAGTTACTGTTAAAGATCAAGAAGAAGGAGATCAACAACAAGAGCAAA 
AGGGTACTAAGCCACAGTTGAAGGCAAAGAAGAAGAATCAAAAGAAAGCTAGAGAAGCTA 
GGTTTGCGTTTCTGACGAAGAGCGATATTGATAATCTTGACGACGGTTATAGGTGGAGAA 
AATACGGCCAAAAAGCTGTCAAAAACAGTCCTTATCCCAGAAGCTATTACCGTTGCACCA 
CAGTGGGTTGCGGAGTGAAGAAGAGAGTGGAGAGATCCTCCGATGATCCTTCGATCGTCA 
TGACAACCTACGAAGGTCAGCATACCCATCCTTTCCCCATGACGCCACGTGGACACATCG 
GAATGCTCACGTCACCAATCCTAGACCACGGTGCAACCACCGCGTCATCATCATCATTCT 
CCATCCCTCAGCCACGTTACTTGCTGACTCAACATCACCAGCCCTACAACATGTACAACA 
ACAACTCTCTAAGTATGATCAATAGAAGATCATCCGATGGCACTTTCGTAAATCCAGGTC
CATCATCAT(^TTCCCCGGCTTTGGTTATGATATGTCTCAAGCTTCTACTTCAACTTCTT 
CTTCCATTAGAGATCATGGATTGCTTCAAGATATTCTTCCTTCGCAGATCAGATCCGATA 
CTATTAACACTCAAACCAATGAAGAGAATAAGAAATGAAGAAGTTTTTTTTCCCGGGGCA 

>G1420 Amino Acid Sequence (domain in AA coordinates: 221-280)

MEKKKEEDHHHQQQQQQQKEIKNTETKIEQEQEQEQKQEISQASSSSNMANLVTSSDHHP

LEIiAGNIiSSIFDTSSLPFPYSYFEDHSSNNPNSFLDIiljRQDHQFASSSNSSSFSFDAFPIi 

PNNNNNTSFFTDLPLPQAESSEVVNTTPTSPNST 

GDQQQEQKGTKPQLKAKKKNQKKAREARFAFL^^ 

RSYYRCTTVGCGVKKRVERSSDDPSIVMTTC 

TASSSSFSIPQPRYLLTQHHQPYNMYNNNSLSMINRRSSDGTFVNPGPSSSFPGFGYDMS 

QASTSTSSSIRDHGLLQDILPSQIRSDTINTQTNEENKK*

>G1412 (115..1008)

CCCACGCGTCCGCCCACGCGTCCGAAACAAAAACATATAATTTGGGTTTTTAGAGTTCGA 

GTTAGAGAGAAAGATCCGTTAGCCCAGTTGAGTTTGCCACCAGGTTTTAGATTTTATCCG

ACAGATGAAGAGCTTCTTGTTCAGTATCTATGTCGGAAAGTTGCAGGCTATCATTTCTCT 

CTCCAGGTCATCGGAGACATCGATCTCTACAAGTTCGATCCTTGGGATTTGCCAAGTAAG 

GCTTTGTTTGGAGAGAAGGAATGGTATTTCTTTAGCCCAAGAGATCGGAAATATCCGAAC 

GGGTCAAGACCCAATAGAGTAGCCGGGTCGGGTTATTGGAAAGCAACGGGTACTGACAAA 

ATTATCACGGCGGATGGTCGTCGTGTCGGGATTAAAAAAGCTCTGGTCTTTTACGCCGGA 

AAAGCTCCCAAAGGCACTAAAACCAACTGGATTATGCACGAGTATCGCTTAATAGAACAT 

TCTCGTAGCCATGGAAGCTCCAAGTTGGATGATTGGGTGTTGTGTCGAATTTACAAGAAA

ACATCTGGATCTCAGAGACAAGCTGTTACTCCTGTTCAAGCTTGTCGTGAAGAGCATAGC 

ACGAATGGGTCGTCATCGTCTTCTTCATCACAGCTTGACGACGTTCTTGATTCGTTCCCG 

GAGATAAAAGACCAGTCTTTTAATCTTCCTCGGATGAATTCGCTCAGGACGATTCTTAAC 

GGGAACTTTGATTGGGCTAGCTTGGCAGGTCTTAATCCAATTCCAGAGCTAGCTCCGACC 

AATGGATTACCGAGTTACGGTGGTTACGATGCGTTTCGAGCGGCGGAAGGTGAGGCGGAG 

AGTGGGCATGTGAATCGGCAGCAGAACTCGAGCGGGTTGACTCAGAGTTTCGGGTACAGC 

TCGAGTGGGTTTGGTGTTTCGGGTCAAACATTCGAGTTTAGGCAATGAGAGAGATGTGAA 

GTTACTGATGGGTGAAAAAAGTAAAAAAAAAACTTGGAGATAGTAGAGTGGCAATTGATG 

TAAATAATAGGGATTTATATGGGGCTTTTACCGATTCGGTGAGGCTTAGGATTCCCCAAA 

GGAAAAAGGCTCGACTGGGGACTAGTTTGATCCAACTTGACGGCCCCCAAATGTGTAATG 

TTTCTCAACGGAGAGAAAAATAAATGGTTACCAATATTTTTCCAAAAAAAAAAAAAAAAA 

>G1412 Amino Acid Sequence (domain in AA coordinates: 17-159) 

MGVREKDPLAQLSLPPGFRFYPTDEELLVQYLCRKVAGYHFSLQVIGDIDLYKFDPWDLP

SKALFGEKEWYFFSPRDRKYPNGSRPNRVAGSGYWKATGTDKIITADGRRVGIKKALVFY

AGKAPKGTKTNWIMHEYRLIEHSRSHGSSKLDDWVLCRIYKKTSGSQRQAVTPVQACREE

HSTNGSSSSSSSQLDDVLDSFPEIKDQSFNLPRMNSLRTILNGNFDWASLAGLNPIPELA

PTNGLPSYGGYDAPRAAEGEAESGHVNRQQNSSGLTQSFGYSSSGFGVSGQTFEFRQ* 

>G738 (1..885)

ATGGACCATCATCAGTATCATCATCATGATCAATACCAACATCAGATGATGACTAGTACT 
AACAATAATTCCTATAACACCATCGTCACAA 

GATTCAACAACAGCAACAACTATGATAATGGATGACGAGAAGAAGTTGATGACGACAATG 
AGCACTAGGCCGCAAGAACCAAGAAACTGTCCAAGATGCAACTCAAGCAACACCAAGTTT 
TGTTATTACAACAACTACAGCTTAGCACAGCCTAGGTACTTGTGTAAGTCTTGTCGGAGA 
TATTGGACTGAAGG^GGCTCTCTCCGTAACGTCCCCGTAGGCGGAGGTTCTAGAAAGAAC 
AAGAAGCTTCCATTTCCTAATTCCTCTACTTCTTCTTCCACCAAGAACCTCCCGGATCTC 
AACCCTCCTTTCGTCTTCACATCATCAGCTTCATCATCAAACCCTAGCAAGACGCATCAA 
AACAATAATGACCTCAGCCTATCCTTCTCCTCCCCTATGCAAGACAAGCGAGCTCAAGGG 
CATTACGGTCATTTCAGTGAGCAAGTTGTGACAGGAGGGCAGAACTGTCTTTTCCAAGCT 
CCTATGGGAATGATTCAGTTTCGTCAAGAGTATGATCATGAGCACCCCAAAAAGAATCTT 
GGGTTTTCATTAGACAGGAACGAGGAAGAGATTGGTAATCATGATAACTTCGTTGTTAAT 
GAGGAAGGAAGTAAGATGATGTATCCTTATGGAGATCATGAAGACCGTCAACAACATCAC 
CATGTGAGACACGATGATGGTAATAAGAAGAGAGAAGGTGGTTCAAGCAATGAGCTATGG 
AGCGGAATCATCCTAGGTGGTGATAGTGGTGGACCAACATGGTGA 
>G738 Amino Acid Sequence (domain in aa coordinates: 351-393)
MDHHQYHHHDQYQHQMMTSTl^SYNTI 

STRPQEPRNCPRCNSSNTKFCYYNNYSLAQPRYLCKSCRRYWTEGGSLRNVPVGGGSRKN
KKLPFPNSSTSSSTKNLPDLNPPFVFTSSASSSNPSKTHQNNNDLSLSFSSPMQDKRAQG

HYGHFSEQVVTGGQNCLFQAPMGMIQFRQEYDHEHPKKNLGFSLDRNEEEIGNHDNFVVN
EEGSKMMYPYGDHEDRQQHHHVRHDDGNKKREGGSSNELWSGIILGGDSGGPTW* 
>G2426 (1..1038)

ATGGGCAGATCGCCATGTTGTGATAAGGCCGGGTTGAAGAAAGGGCCTTGGACTCCAGAA 
GAGGATCAGAAACTTTTGGCTTATATTGAAGAACATGGCCATGGAAGCTGGCGTTCTTTG 
CCTGAGAAAGCCGGTCTCCAAAGGTGTGGAAAGAGTTGCAGACTCAGATGGACTAACTAC 
CTAAGACCTGACATCAAGAGAGGCAAATTCACTGTACAAGAAGAACAAACCATCATTCAA 
CTCCACGCTCTCCTCGGAAACAGGTGGTCAGCGATTGCAACTCATTTACCAAAGAGGACA 
GACAACGAGATCTVAGAACTACTGGAACACACACTTGAAGTyVACGTCTGATCAAAATGGGG 
ATAGATCCAGTGACTCACAAGCACAAAAACGAGACTCTTTCGTCTTCCACAGGACAATCA 
AAGAACGCAGCCACGCTTAGTCATATGGCTCAATGGGAGAGTGCAAGACTCGACGCTGAA 
GCAAGGCTAGCTAGAGAATCAAAGCTTCTCCATTTACAGCATTACCAAAACAATAACAAC 
CTTAACAAATCAGCAGCTCCTCAACAACATTGCTTCACTCAAAAAACATCAACAAACTGG 
ACTAAACCAAACCAAGGAAACGGAGACCAACAGCTTGAATCTCCGACATCGACGGTGACA 
TTCTCTGAGAATCTTCTGATGCCTTTAGGAATCCCTACGGATAGCAGCAGAAATAGAAAC 
AATAACAACAATGAGTCCTCGGCGATGATTGAATTGGCCGTATCTTCGTCAACCTCCTCC 
GATGTGAGTCTGGTCAAAGAACATGAACACGACTGGATTAGGCAGATCAACTGTGGTAGT 
GGAGGAATAGGAGAAGGATTCACGAGTCTATTGATCGGTGATTCGGTCGGCCGGGGTTTA 
CCCACCGGGAAAAACGAAGCGACGGCGGGCGTGGGGAATGAGAGTGAGTATAACTACTAT 
GAGGATAACAAGAATTACTGGAATAGCATTCTCAACTTGGTTGATTCTTCACCGTCCGAT 
TCCGCGACGATGTTCTGA 

>G2426 Amino Acid Sequence (conserved domain in AA coordinates: 14-114)
MGRSPCCDKAGLKKGPWTPEEDQKLLAYIEEHGHGSWRSLPEKAGLQRCGKSCRLRWTNY 
LRPDIKRGKFTVQEEQTI I QLHALLGl^WSAIAT^ 

IDPVTHKHKNETLSSSTGQSKNAATLSHMAQWESARLDAEARLARESKLLHLQHYQNNNN
LNKSAAPQQHCFTQKTSTNWTKPNQGNGDQQLESPTSTVTFSENLLMPLGIPTDSSRNRN
NNNNESSAMIELAVSSSTSSDVSLVKEHEHDWIRQINCGSGGIGEGFTSLLIGDSVGRGL

PTGKNEATAGVGNESEYNYYEDNKNYWNSILNLVDSSPSDSATMF*
>G1524 (1..825)

ATGGGGAGAACTAAGGAGCAGGCAACATTAACTCGGTATCCACCCTGTCCTAGGAATCCT 
GCTAAATTCAATGATATAAACAAAGCACTCCAGGAAAAAGGATATGGTAAGGCTCTGAAA
AGAAAACCTTGGACGGGTGTGACATGCCCTGTCTGTCTTGAGGTTCCTCACAACTCGGTC 
GTCCTCCTTTGTTCATCTTACCACAAAGGATGCCGTCCGTACATGTGTGCCACGGGAAAC 
CGTTTCTCAAATTGTCTAGAGCAGTACAAAAAGGCATATGCCAAGGATGAGAAAAGTGAC 
AAACCGCCAGAGCTATTGTGCCCGCTTTGTAGGGGTCAGGTGAAAGGCTGGACCGTTGTG 
GAAAAGGAACGTAAGTATCTGAATTCTAAGAAAAGGTCATGCATGAACGACGAGTGTTTG 
TTTTATGGAAGCTATAGACAGCTCAAGAAGCATGTTAAGGAGAACCATCCGAGAGCCAAG 
CCAAGAGCCATAGACCCTGTGCTGGAGGCGAAATGGAAGAAGCTTGAGGTTGAGAGGGAG 
AGGAGTGATGTAATCAGCACAGTCATGTCGTCAACACCTGGGGCTATGGTATTTGGAGAC 
TATGTGATTGAGCCATACAATGGTTATGATCATCAAGATGACAGTGACGATTACAGTGAT 
TCGTCGGATGACGAAATGGAAGGTGGGGTATTCGAGCTTGGAGCATTCGACCTGGGCCGT 
CTTCAACCGCGTTCGGCTGCCATCTCAAGCCGGGGAATTCGCGGTATGATCATAAGGAAC 
CGGTGGGCTCGAAGCAGAGGTGCGAGCAGAAGGCGACAAACATAA 

>G1524 Amino Acid Sequence (conserved domain in AA coordinates: 49-110)
MGRTKEQATLTRYPPCPRNPAKFNDINKALQEKGYGKALKRKPWTGVTCPVCLEVPHNSV
VLLCSSYHKGCRPYMCATGNRFSNCLEQYKKAYAKDEKSDKPPELLCPLCRGQVKGWTVV
EKERKYLNSKKRSCMNDECLFYGSYRQLKKHVKENHPRAKPRAIDPVLEAKWKKLEVERE

RSDVISTVMSSTPGAMVFGDYVIEPYNGYDHQDDSDDYSDSSDDEMEGGVFELGAFDLGR
LQPRSAAISSRGIRGMIIRNRWARSRGASRRRQT*
>G1243 (1..3174) 

ATGGCGAGAAATTCGAATTCCGATGAGGCTTTCTCGTCAGAGGAGGAAGAAGAGCGGGTT 
AAGGATAATGAAGAAGAAGATGAGGAGGAGCTCGAGGCTGTTGCTCGTTCTTCTGGCTCC 
GACGATGACGAAGTAGCCGCCGCCGACGAATCACCAGTCTCCGACGGAGAGGCTGCTCCC 
GTAGAAGATGATTACGAGGACGAAGAAGATGAGGAAAAAGCTGAAATCAGCAAACGTGAG

AAAGCCAGACTTAAAGAGATGCAGAAGTTGAAGAAGCAGAAGATTCAAGAGATGCTGGAG 

TCGCAGAATGCTTCCATTGACGCGGATATGAACAATAAGGGAAAAGGGAGACTGAAGTAT 

CTTCTGCAGCAAACTGAGTTATTTGCCCACTTTGCTAAAAGTGATGGATCTTCTTCTCAG 

AAGAAGGCAAAAGGAAGGGGACGTCATGCTTCCAAAATAACTGAAGAGGAGGAAGACGAA 

GAGTATCTAAAGGAAGAAGAGGATGGCTTAACTGGATCTGGAAACACACGGTTACTCACA 

CAGCCCTCTTGTATTCAAGGGAAGATGAGAGATTACCAATTAGCTGGTTTGAACTGGCTC 

ATTCGTCTTTATGAGAATGGCATAAATGGAATTCTTGCTGATGAAATGGGTCTGGGGAAG 

ACGCTTCAAACGATTTCTTTGTTGGCATATCTTCATGAATACAGGGGAATCAATGGTCCC 

CATATGGTGGTTGCTCCAAAATCAACACTTGGTAATTGGATGAACGAAATTCGCCGGTTT 

TGTCCTGTCCTACGTGCTGTGAAGTTCCTTGGTAATCCTGAGGAGAGGAGACATATTCGA 

GAAGACCTGCTAGTTGCTGGGAAATTTGATATTTGTGTCACAAGCTTTGAGATGGCCATC 

AAAGAGAAGACAGCACTTCGTCGGTTTAGCTGGCGTTATATTATCATTGATGAAGCGCAT 

CGAATCAAGAACGAGAATTCACTCCTTTCTAAAACCATGAGACTTTTTAGCACCAATTAT 

CGGCTTCTTATCACGGGGACCCCCCTTCAGAATAATCTCCATGAACTGTGGGCTCTTCTA 

AATTTTCTTCTGCCTGAGATTTTTAGTTCAGCAGAGACTTTTGATGAATGGTTTCAAATT 

TCTGGTGAGAATGACCAGCAAGAAGTTGTGCAACAACTGCACAAGGTTCTTCGACCATTT 

CTTCTTCGAAGACTAAAGTCAGATGTTGAGAAAGGTTTGCCACCGAAGAAGGAGACCATA 

CTTAA^GTTGGTATGTCTCAGATGCAAAAGCAATACTACAAGGCTTTACTGCAGAAGGAT 

CTTGAAGCGGTTAATGCTGGTGGAGAACGCAAACGTCTGCTAAACATTGCAATGCAACTG 

CGTAAATGCTGCAATCACCCCTATCTCTTCCAGGGTGCAGAACCTGGTCCCCCATATACC 

ACAGGAGATCACCTTATAACAAATGCTGGTAAGATGGTTCTCTTGGATAAATTGCTTCCT 

AAGTTGAAAGAACGTGATTCAAGGGTGCTGATATTTTCTCAGATGACAAGACTTTTGGAT 

ATTCTTGAGGACTATTTAATGTATCGTGGTTACTTGTATTGCCGTATTGATGGAAACACT 

GGTGGTGACGAACGAGATGCCTCCATAGAAGCCTACAACAAGCCAGGAAGTGAGAT^ATTT 

GTTTTCTTGTTATCTACTAGAGCTGGAGGGCTTGGTATCAATCTTGCTACTGCAGATGTT 

GTGATCCTTTACGATAGTGATTGGAACCCACAAGTCGACTTGCAAGCTCAGGATCGTGCC 

CATAGGATTGGTCAAAAAAAAGAAGTTCAAGTGTTTCGATTCTGCACTGAGTCTGCTATT 

GAGGAGAAAGTGATTGAAAGAGCTTACAAGAAGTTAGCACTTGATGCTCTGGTTATTCAA 

CAAGGGAGATTGGCAGAACAGAAAAGTAAGTCTGTCAATAAGGATGAGTTGCTTCAAATG 

GTAAGATATGGTGCTGAGATGGTGTTCAGTTCTAAAGATAGCACAATCACAGACGAGGAT 

ATTGATAGAATCATTGCCAAAGGAGAAGAGGCAACAGCTGAACTTGATGCTAAGATGAAG 

AAATTCACAGAAGATGCTATACAGTTTAAAATGGATGACAGTGCTGACTTCTATGATTTT 

GATGATGACAATAAGGATGAAAACAAGCTCGATTTTAAAAAGATTGTAAGCGACAATTGG 

AATGATCCCCCCAAGCGGGAGAGAAAGCGCAACTACTCTGAATCTGAGTACTTTAAGCAA 

ACATTGCGGCAAGGTGCTCCAGCTAAACCTAAAGAGCCTAGAATTCCGCGCATGCCCCAG 

TTGCACGATTTCCAGTTCTTTAACATTCAGAGATTGACCGAGTTGTATGAAAAGGAAGTA 

CGTTATCTCATGCAAACACATCAGAAAAATCAGTTG7UUVGACACAATTGATGTTGAAGAA 

CCAGAAGGTGGGGATCCCTTAACTACTGAAGAAGTAGAAGAAAAGGAGGGATTATTGGAG 

GAGGGTTTCTCAACATGGAGCAGAAGAGATTTTAATACTTTCCTCAGGGCTTGTGAGAAG 

TATGGCCGCAACGACATAAAAAGCATTGCCTCTGAGATGGAAGGGAAAACAGAGGAAGAA 

GTTGAAAGATATGCCAAAGTATTTAAAGAGCGGTACAAGGAGCTGAACGACTATGATAGA 

ATCATTAAGAACATTGAGAGGGGAGAGGCAAGGATCTCTAGGAAAGACGAAATCATGAAG 

GCCATAGGGAAGAAACTGGATCGCTACAGAAACCCTTGGCTGGAACTGAAGATTCAATAT 

GGTCAGAACAAAGGCAAGCTGTACAATGAAGAGTGTGACCGTTTCATGATCTGCATGATT 

CACAAACTTGGTTATGGGAATTGGGATGAGCTAAAGGCAGCATTTAGGACATCGTCTGTG 

TTCAGGTTTGACTGGTTTGTGAAATCCCGCACGAGTCAGGAACTTGCAAGAAGATGCGAC 

ACTCTGATTCGACTGATCGAGAAAGAGAACCAGGAGTTTGATGAAAGAGAGAGGCAAGCC

CGCAAAGAGAAGAAGCTCGCGAAGAGTGCAACACCATCAAAGCGACCTTTAGGAAGACAA 

GCAAGTGAGAGTCCTTCATCGACGAAGAAGCGGAAGCACCTGTCGATGAGATGA 

>G1243 Amino Acid Sequence (domain in AA coordinates: 216-609) 

MARNSNSDEAFSSEEEEERVKDNEEEDEEELEAVARSSGSDDDEVAAADESPVSDGEAAP

VEDDYEDEEDEEKAEISKREKARLKEMQKLKKQKIQEMLESQNASIDADMNNKGKGRLKY

LLQQTELFAHFAKSDGSSSQKKAKGRGRHASKITEEEEDEEYLKEEEDGLTGSGNTRLLT

QPSCIQGKMRDYQLAGLNWLIRLYENGINGILADEMGLGKTLQTISLLAYLHEYRGINGP

HMVVAPKSTLGNWMNEIRRFCPVLRAVKFLGNPEERRHIREDLLVAGKFDICVTSFEMAI

KEKTALRRFSWRYIIIDEAHRIKNENSLLSKTMRLFSTNYRLLITGTPLQNNLHELWALL
NFLLPEIFSSAETFDEWFQISGENDQQEVVQQLHKVLRPFLLRRLKSDVEKGLPPKKETI

LKVGMSQMQKQYYKALLQKDLEAVNAGGERKRLLNIAMQLRKCCNHPYLFQGAEPGPPYT

TGDHLITNAGKMVLLDKLLPKLKERDSRVLIFSQMTRLLDILEDYLMYRGYLYCRIDGNT

GGDERDASIEAYNKPGSEKFVFLLSTRAGGLGINLATADVVILYDSDWNPQVDLQAQDRA

HRIGQKKEVQVFRFCTESAIEEKVIERAYKKLALDALVIQQGRLAEQKSKSVNKDELLQM

VRYGAEMVFSSKDSTITDEDIDRIIAKGEEATAELDAKMKKFTEDAIQFKMDDSADFYDF 

DDDNKDENKLDFKKIVSDNWNDPPKRERKRNYSESEYFKQTLRQGAPAKPKEPRIPRMPQ 

LHDFQFFNIQRLTELYEKEVRYLMQTHQKNQLKDTIDVEEPEGGDPLTTEEVEEKEGLLE 

EGFSTWSRRDFNTFLRACEKYGRNDIKSIASEMEGKTEEEVERYAKVFKERYKELNDYDR

IIKNIERGEARISRKDEIMKAIGKKLDRYRNPWLELKIQYGQNKGKLYNEECDRFMICMI

HKLGYGNWDELKAAFRTSSVFRFDWFVKSRTSQELARRCDTLIRLIEKENQEFDERERQA

RKEKKLAKSATPSKRPLGRQASESPSSTKKRKHLSMR*

>G631 (190..1461)

CTTCTTCTTCTTCTTCTTCTTCTTCTTCCTCCTCTCTCGTCGGATCTCTCTGATTTAGTG 

atttttcaaatttcaagttttcttcacctttaattttgtgtctcgttgatctctctttgg 
acattctgctttggattctggaggcttctcattagatctctattagtgggtttaggtcaa 
gttcttgaaatggataaggagaaatctcctgcaccaccacctagtggaggtcttcctcca 
ccatcgggtcgttactctgcgttttcacctaatggaagtagctttgcaatgaaagctgaa 
tcatcttttcctcctttgactccaagtggaagcaatagctcagatgctaaccgattcagc 
catgatattagccgaatgccggataatccacctaagaacctaggccatcgccgagctcat 
tcagagattcttactcttcctgatgacttaagctttgatagtgatcttggtgtggttggt 
gctgctgatggaccttctttctctgatgatactgacgaggacttactctatatgtatctt 
gatatggaaaaattcaattcttctgctacatcgacttctcaaatgggtgagccatcagaa 
ccgacttggaggaatgaattagcctcgacttctaaccttcagagtacacccggtagctct 
agtgaaagaccgagaattagacaccaacacagccaatcgatggatggttcaacaactatc 
aagcctgagatgcttatgtcagggaatgaagatgtgtctggagttgactctaagaaagcc 
atctctgctgctaaactttctgagcttgctctcattgatccaaaacgcgccaagaggata 
tgggcaaacaggcagtctgctgcgaggtcaaaagaaaggaagatgagatacattgcagag 
ctcgagagaaaagtacagactttacaaacagaggccacatctctctcagcccagttgact 
ctcttacagagagatacaaatggcctgggtgttgaaaacaatgagcttaaactgcgagta 

CAAACTATGGAGCAACAGGTCCACCTACAGGATGCTTTAAATGATGCACTAAAGGAGGAA 
GTCCAGCATCTTAAGGTATTGACGGGGCAAGGTCCATCAAATGGTACATCAATGAACTAC 
GGTTCTTTTGGATCAAACCAGCAATTCTATCCCAATAATCAGTCGATGCACACTATCTTA 
GCCGCACAACAGTTACAGGAGCTCCAGATCCAGTCACAGAAACAGCAACAACAACAACAG 
CAACACCAGCAACAACAACAGCAGCAGCAGCAGCAATTTCACTTTCAACAGCAGCAACTG 
TACCAGCTTCAGCAGCAGCAACGGCTTCAACAACAGGAACAACAAAGCGGGGCTTCAGAG 
CTAAGAAGACCCATGCCTTCTCCTGGTCAGAAAGAGAGTGTGACATCGCCTGATCGTGAA 
ACTCCCTTGACAAAAGACTGAGTCTAGACTGTGCTAATGTCCAATTTAGTAAGTTACTCT 
TGGAAAATCTTCTTTTTCATCGCAGGCTCATGGATTTGGGATTTACTGCATTATAGAGTT 
AAAAACAAGACAGCTTAGAAGTTGCGGATTTAGAAGTTGTTAGTGAAGCTTTTGTTCTCG 
TCTGTTGGTAGTTTACAATCTTCTCTTTGTATGATCCTAAG 

>G631 Amino Acid Sequence (domain in AA coordinates: TBD) 

MDKEKSPAPPPSGGLPPPSGRYSAFSPNGSSFAMKAESSFPPLTPSGSNSSDANRFSHDI 

SRMPDNPPKNLGHRRAHSEILTLPDDLSFDSDLGVVGAADGPSFSDDTDEDLLYMYLDME

KFNSSATSTSQMGEPSEPTWRNELASTSNLQSTPGSSSERPRIRHQHSQSMDGSTTIKPE 

MLMSGNEDVSGVDSKXAISAAKIjSELALIDPKJ^^ 

K^QXLQTEATSLSAQLTLLQRDTNGLGVEW^ 

LK^TGQGPSNGTSMNYGSFGSNQQFYPNNQSMHTILAAQQLQQIiQIQSQKQQQQQQQHQ 
QQQQQQQQQFHFQQQQLYQLQQQQRLQQQEQQSGASELRRPMPSPGQKESVTSPDRETPL
TKD* 

>G1909 (1..828) 

ATGGGTGGATCGATGGCGGAGAGAGCAAGGCAGGCCAACATTCCTCCACTAGCGGGACCC 
CTAAAGTGTCCTCGATGCGACTCCAGCAACACTAAGTTCTGTTACTACAACAACTATAAC 
CTCACTCAGCCTCGTCACTTCTGCAAAGGTTGCCGTCGCTACTGGACACAAGGGGGCGCC 
CTGAGAAACGTCCCTGTAGGTGGAGGCTGCCGGAGGAATAACAAGAAGGGCAAAAATGGA 
AATTTAAAATCTTCTTCTTCTTCGTCCAAACAGTCTTCCTCGGTCAACGCTCAAAGTCCT 
AGCTCAGGACAGCTAAGGACAAATCATCAGTTCCCTTTTTCACCAACTCTTTACAATCTC
ACTCAACTCGGAGGTATTGGTTTGAACTTAGCCGCTACTAATGGCAACAACCAAGCTCAC
CAGATCGGTTCCAGTTTGATGATGAGCGATCTAGGGTTTCTCCATGGACGAAATACTTCA 
ACTCCGATGACGGGAAACATTCATGAAAACAACAACAATAATAACAATGAAAACAACCTA 
ATGGCATCCGTTGGATCTTTGAGCCCCTTTGCTCTCTTCGATCCAACGACGGGGCTATAC 
GCTTTCCAGAACGACGGTAATATCGGGAACAACGTTGGGATATCTGGTTCTTCTACTTCC 
ATGGTTGATTCTAGGGTTTATCAGACGCCTCCGGTGAAGATGGAAGAACAACCTAATTTG 
GCTAACTTGTCTAGACCGGTCTCCGGTTTGACGTCTCCTGGGAATCAAACAAATCAGTAC 
TTTTGGCCTGGTTCGGATTTCTCGGGTCCTTCTAATGATCTCTTGTGA 

>G1909 Amino Acid Sequence (conserved domain in AA coordinates: 23-51)

MGGSMAERARQANIPPLAGPLKCPRCDSSNTKFCYYNNYNLTQPRHFCKGCRRYWTQGGA

LRNVPVGGGCRRNNKKGKNGNLKSSSSSSKQSSSVNAQSPSSGQLRTNHQFPFSPTLYNL

TQLGGIGLNLAATNGNNQAHQIGSSLMMSDIiGFLH^ 

MASVGSLSPFALFDPTTGLYAFQNDGNIGETt^^ 

ANLSRPVSGLTSPGNQTNQYFWPGSDFSGPSNDLL* 

>G1663 (64..630)

TTCTCTCTGTGAATCCTTGTTCATCGTCACTGAAATTAGTTTACAAAATCGACGAATTCG 

GAGATGATTTTTCAGAATGTGTGCAGAAATGAGTCCAACTTCAACGCTATAGCTTCCGAA 

TCGCGTTCCCAAACGCAGTTCGGTGTTTCGAAATCCTCCTCGAGCGGCGGCGGATGTATC 

TCCGCCAGGACTAAAGACCGTCACACGAAGGTTAACGGACGAAGCCGTCGAGTTACGATG 

CCGGCTCTCGCCGCCGCTAGGATTTTCCAGTTAACGCGTGAGCTCGGTCACAAAACTGAA 

GGAGAAACCATCGAATGGCTTCTTAGTCAAGCTGAACCGTCGATTATTGCCGCCACTGGC 

TACGGGACTAAGCTCATTTCGAATTGGGTTGATGTTGCGGCGGACGATTCCTCGTCGTCG 

TCGTCGATGACGTCGCCGCAAACGCAAACGCAAACGCCACAATCGCCGAGTTGTAGGTTG 

GATCTTTGTCAGCCAATCGGAATTCAGTATCCGGTGAATGGTTACAGTCATATGCCGTTC 

ACAGCGATGCTTTTAGAGCCGATGACCACGACGGCGGAATCTGAGGTTGAGATCGCGGAG 

GAGGAGGAACGTAGACGCCGTCACCATTAGTAAAATTAGGCTTTTGATTTAGAGTGTTAA

AATTAGGATTTTAAAAGTTTAGGAGGTAACAGATAAGGATAATT 

>G1663 Amino Acid Sequence (domain in AA coordinates: TBD)

MIFQNVCRNESNFNAIASESRSQTQFGVSKSSSSGGGCISARTKDRHTKVNGRSRRVTMP

ALAAARIFQLTRELGHKTEGETIEWLLSQAEPSIIAATGYGTKLISNWVDVAADDSSSSS

SMTSPQTQTQTPQSPSCRLDLCQPIGIQYPVNGYSHMPFTAMLLEPMTTTAESEVEIAEE 

EERRRRHH* 

>G1231 (103..870)

CAAACCCAAATTCTCTCAGCGCCGGTCAAATACTTGTCTCTCTCTCTCTCTCTCTTTCAC 
TCTTGTCTTGTCTCCTTCGAAGCTGTTTGTTCTGTAAGAAAGATGGAAGCAGGTGGCGCG 
TACAATCCACGCACTGTTGAAGAGGTGTTTAGGGATTTTAAGGGTCGTAGAGCTGGCATG 
ATTAAGGCTTTAACCACTGATGTTCAGGAGTTTTTCCGACTTTGTGATCCCGAAAAGGAG 
AACCTTTGCCTTTACGGACATCCAAATGAGCACTGGGAAGTGAATTTGCCAGCTGAAGAG 
GTTCCTCCTGAGCTCCCAGAGCCTGTCTTGGGTATCAATTTTGCCAGAGACGGGATGGCG 
GAAAAGGATTGGTTGTCCCTTGTTGCTGTCCACAGTGATGCTTGGCTTCTTGCTGTTGCT 
TTCTTTTTTGGAGCCAGGTTTGGATTTGACAAAGCTGATAGGAAGAGGCTTTTCAATATG 
GTGAATGACCTCCCAACAATCTTTGAGGTTGTAGCTGGCACTGCTAAGAAACAAGGAAAA 
GATAAGTCGTCTGTTTCCAACAACAGCAGCAACAGATCCAAATCAAGCTCCAAGCGAGGA 
TCTGAATCCCGTGCCAAGTTCTCAAAGCCGGAGCCCAAAGATGATGAGGAGGAGGAAGAG 
GAAGGTGTGGAAGAGGAGGATGAGGATGAGCAAGGTGAAACACAGTGTGGAGCATGTGGT 
GAGAGCTATGCAGCTGATGAGTTCTGGATTTGCTGTGACCTCTGTGAGATGTGGTTTCAT 
GGAAAGTGTGTTAAGATAACACCAGCAAGAGCTGAGCACATCAAGCAATACAAGTGCCCT 
TCTTGCAGCAACAAAAGGGCTCGTTCCTAAATTTGTTGACCGCTCGCTTCTGTGTATCTA
CCTTTGCATATGATGATGAACAGCTTAACTGTTTGGTTTAGATCAGATTTGTCATATGGA 
TTTGGTAATTTAGGAAGACATTTTAGTTTTTTCATTGTTACATTTTGGCGATTGAAGGGA 
TAACTCTTTGTTTAGGGGTAATGATCTTTTGCTCTGTTTTATGTTTGTTTATTAACATTC 
TTCAAACTCAATCAAAAGTATTTTGGTTAGTCTTAAAA 

>G1231 Amino Acid Sequence (domain in AA coordinates: TBD) 
MEAGGAYNPRTVEEVFRDFKGRRAGMIKALTTDVQEFFRLCDPEKENLCLYGHPNEHWEV 
NLPAEEVPPELPEPVLGINFARDGMAEKDWLSLVAVHSDAWLLAVAFFFGARFGFDKADR
KRLFNMVNDLPTIFEVVAGTAKKQGKDKSSVSNNSSNRSKSSSKRGSESRAKFSKPEPKD
DEEEEEEGVEEEDEDEQGETQCGACGESYAADEFWICCDLCEMWFHGKCVKITPARAEHI
KQYKCPSCSNKRARS*

>G227 (21..983)

GTACCGTCGACGATCCGGCGATGTCAAACCCGACCCGTAAGAATATGGAGAGGATTAAAG 
GTCCATGGAGTCCAGAAGAAGATGATCTGTTGCAGAGGCTTGTTCAGAAACATGGTCCGA 
GGAACTGGTCTTTGATTAGCAAATCAATCCCTGGACGTTCCGGCAAATCTTGTCGTCTCC 
GGTGGTGTAACCAGCTATCTCCGGAGGTAGAGCACCGTGCTTTTTCGCAGGAAGAAGACG 
AGACGATTATTCGAGCTCACGCTCGGTTTGGTAACAAGTGGGCTACGATCTCTCGTCTTC 
TCAATGGACGAACCGATAACGCTATCAAGAATCATTGGAACTCGACGCTGAAGCGAAAAT 
GCAGCGTCGAAGGGCAAAGTTGTGATTTTGGTGGTAATGGAGGGTATGATGGTAATTTAG 
GAGAAGAGCAACCGTTGAAACGTACGGCGAGTGGTGGTGGTGGTGTCTCGACTGGCTTGT 
ATATGAGTCCCGGAAGTCCATCGGGATCTGACGTCAGCGAGCAATCTAGTGGTGGTGCAC 
ACGTGTTTAAACCAACGGTTAGATCTGAGGTTACAGCGTCATCGTCTGGTGAAGATCCTC 
CAACTTATCTTAGTTTGTCTCTTCCTTGGACTGACGAGACGGTTCGAGTCAACGAGCCGG 
TTCAACTTAACCAGAATACGGTTATGGACGGTGGTTATACGGCGGAGCTGTTTCCGGTTA 
GAAAGGAAGAGCAAGTGGAAGTAGAAGAAGAAGAAGCGAAGGGGATATCTGGTGGATTCG 
GTGGTGAGTTCATGACGGTGGTTCAGGAGATGATAAGGACGGAGGTGAGGAGTTACATGG 
CGGATTTACAGCGAGGAAACGTCGGTGGTAGTAGTTCTGGCGGCGGAGGTGGCGGTTCGT 
GTATGCCACAAAGTGTAAACAGCCGTCGTGTTGGGTTTAGAGAGTTTATAGTGAACCAAA 
TCGGAATTGGGAAGATGGAGTAGGCGGCC 

>G227 Amino Acid Sequence (domain in AA coordinates: 13-112) 
MSNPTRKNMERIKGPWSPEEDDLLQRLVQKHGPRNWSLISKSIPGRSGKSCRLRWCNQLS
PEVEHRAFSQEEDETIIRAHARFGNKWATISRLLNGRTDNAIKNHWNSTLKRKCSVEGQS
CDFGGNGGYDGNLGEEQPLKRTASGGGGVSTGLYMSPGSPSGSDVSEQSSGGAHVFKPTV
RSEVTASSSGEDPPTYLSLSLPWTDETVRVNEPVQLNQNTVMDGGYTAELFPVRKEEQVE
VEEEEAKGISGGFGGEFMTVVQEMIRTEVRSYMADLQRGNVGGSSSGGGGGGSCMPQSVN
SRRVGFREFIVNQIGIGKME*
>G1842 (219..809)

ACTATTACATGCCTCTTCCTCGCTTCAAAACGGCACCGTTTCCACTTGTTATTATTTTTC 

TCTCTATCGTCTAACAAAAAAAAAAACTGACTTGGGATTTTTTTTCATTTGTCTAGCCCA 

AAAGJ\AGAAGATAGAAACGAAGAAAAAAAGCAAACACATTTTGGGTCCCCGGTGGTTAGG 

ATCAAATTAGGGCACAAACCTTATCGGAGAAAGAAGCCATGGGAAGAAGAAAAGTCGAGA 

TCAAGCGAATCGAGAACAAAAGCAGTCGACAAGTCACTTTCTCCAAACGACGCAAAGGTC 

TCATCGAAAAAGCTCGACAACTTTCAATTCTCTGTGAATCrrCGATCGCTGTTGTC 

TCTCCGGTTCCGGAAAACTCTACGACTCTGCCTCCGGTGACAACATGTCAAAGATCATTG 

ATCGTTATGAAATACATCATGCTGATGAACTTAAAGCCTTAGATCTTGCAGAAAAAATTC 

GGAATTATCTTCCAGACAAGGAGTTACTAGAAATAGTCCAAAGCAAGCTTGAAGAATCAA 

ATGTCGATAATGTAAGTGTAGATTCTCTAATATCTATGGAGGAACAGCTCGAGACTGCTC 

TGTCAGTAATTAGAGCTAAGAAGACAGAACTAATGATGGAGGATATGAAGTCACTTCAAG 

AAAGGGAGAAGTTGCTGATAGAAGAGAACCAGATTCTGGCTAGCCAGGTGGGGAAGAAGA 

CGTTTCTGGTTATAGAAGGTGACAGAGGAATGTCACGGGAAAATGGCTCCGGCAACAAAG 

TACCGGAGACTCTTTCGCTGCTCAAGTAATCACCATCATCAACGGGTGAGCTTTCACCAT 

AAACTTACTCACAGCCTGATTCAGAAGCTTTTACAAAATTGTAAATTATAAAAAGCTGCA 

TAATAATCTCAACCTTTTTATCTTCCTCGCGCCAATGTGGAAATAAAGGTAAAACAAAAC 

GAAGCTCTTTTCTTTTATGCGAAAGAATTGTAAAACTAAGATAAAGCTACCGATCTTTGT 

TGTACCTTAGTAGACAAATATCAGAGTTCTTGTGCTTGT 

>G1842 Amino Acid Sequence (domain in AA coordinates: 2-57) 
MGRRKVEIKRIENKSSRQVTFSKRRKGLIEKARQLSILCESSIAVVAVSGSGKLYDSASG
DNMSKIIDRYEIHHADELKALDLAEKIRNYLPHKELLEIVQSKLEESNVDNVSVDSLISM

EEQLETALSVIRAKKTELMMEDMKSLQEREKLLIEENQILASQVGKKTFLVIEGDRGMSR

ENGSGNKVPETLSLLK*

>G1505 (1..681) 

ATGGATGATATAGCGGAACTTGAATGGTTATCAAATTTCGTAGATGATTCTTCTTTCACG 
CCGTATTCTGCTCCGACGAATAAACCGGTTTGGTTAACCGGAAATCGGAGACATCTTGTA 
CAACCGGTTAAAGAGGAGACCTGCTTCAAATCCCAACATCCGGCCGTCAAAACCAGACCC 
AAACGAGCCAGAACCGGAGTCAGAGTCTGGTCTCATGGTTCGCAGTCGTTAACCGACTCA 
TCTTCAAGCTCTACAACATCTTCGTCGTCCTCTCCTCGTCCTTCAAGCCCTCTATGGCTC 
GCCAGCGGTCAGTTTCTTGATGAGCCAATGACTAAAACACAAAAGAAGAAGAAAGTTTGG
AAAAACGCTGGTCAGACGCAAACGCAAACGCAGACGCAGACGCGGCAGTGTGGTCATTGT 
GGAGTTCAGAAAACGCCGCAGTGGAGAGCAGGACCATTAGGAGCGAAGACGTTGTGTAAT 
GCGTGTGGTGTGCGTTACAAATCGGGTCGGTTACTACCCGAATATAGACCCGCTTGTAGC 
CC/y^CATTTTCGAGTGAGCTTCACTCAAACCACCACAGTAAAGTCATTGAGATGCGTAGG 
AAGAAAGAGACTTCTGACGGTGCTGAAGAAACCGGTTTGAACCAGCCGGTTCAGACGGTT 
CAGGTTGTCTCGAGTTTTTGA 

>G1505 Amino Acid Sequence (domain in AA coordinates: TBD) 

MDDIAELEWLSNFVDDSSFTPYSAPTNKPVWLTGNRRHLVQPVKEETCFKSQHPAVKTRP

KRARTGVRVWSHGSQSLTDSSSSSTTSSSSSPRPSSPLWLASGQFLDEPMTKTQKKKKVW

KNAGQTQTQTQTQTRQCGHCGVQKTPQWRAGPLGAKTLCNACGVRYKSGRLLPEYRPACS

PTFSSELHSNHHSKVIEMRRKKETSDGAEETGLNQPVQTVQVVSSF*

>G657 (1..2331) 

ATGAAGCGTGAGATGAAAGCACCTACTACTCCACTAGAGAGTCTCCAAGGTGACCTCAAA 
GGAAAACAAGGGAGGACATCTGGCCCTGCTAGACGATCTACCAAAGGACAATGGACACCT 
GAAGAGGACGAAGTCTTGTGTAAAGCTGTTGAGCGTTTTCAAGGAAAGAACTGGAAGAAG 
ATAGCTGAATGTTTTAAGGATCGGACTGATGTTCAGTGTCTTCATAGATGGCAAAAGGTC 
TTGAACCCAGAGCTTGTGAAAGGACCGTGGTCAAAAGAGGAGGATAACACAATAATTGAC 
CTGGTTGAAAAATATGGGGCAAAGAAATGGTCTACTATATCTCAGCATTTACCTGGGCGC 
ATAGGAAAGCAATGTAGGGAAAGGTGGCATAACCATCTTAACCCTGGGATTAATAAAAAT 
GCATGGACTCAGGAAGAGGAACTGACTCTTATTCGTGCGCATCAAATTTATGGGAATAAA 
TGGGCAGAGCTTATGAAATTTTTGCCAGGAAGGTCAGATAATTCGATAAAAAATCATTGG 
AACAGCTCAGTTAAGAAGAAGTTGGATTCCTACTATGCATCAGGTCTTTTAGATCAGTGT 
CAAAGCTCGCCATTAATTGCCCTTCAGAAGAAATCTATCGCTTCATCTTCCTCGTGGATG 
CACAGCAATGGAGATGAAGGTAGTTCAAGGCCAGGGGTTGATGCTGAGGAATCAGAATGC 
AGCCAAGCTTCAACTGTTTTCTCACAAT<^ 

GGAAATGAGGAATATTACATGCCTGAATTTCATTCAGGAACGGAGCAGCAAATCTCAAAC 

GCTGCATCTCATGCAGAACCGTACTACCCTTCCTTTAAAGATGTCAAAATTGTTGTCCCC 

GAAATTTCTTGTGAAACAGAATGTTCGAAGAAGTTTCAGAATCTTAATTGTTCTCACGAG 

CTAAGAACTACCACAGCTACGGAGGATCAATTGCCGGGTGTATCTAATGATGCTAAACAG 

GACCGTGGTCTAGAGTTATTGACCCATAACATGGACAACGGTGGAAAAAACCAAGCACTT 

CAAC^GATTTTCAAAGTTCAGTAAGATTAAGTGATCAACCTTTTTTGTCAAACTCGGAC 

ACAGATCCAGAAGCTCAAACTTTGATCACGGATGAGGAGTGTTGTAGGGTTCTTTTTCCA

GATAACATGAAAGATAGCAGTACATCTTCTGGTGAGCAAGGTCGGAATATGGTTGACCCT 

CAAAACGGCAAAGGATCTCTTTGTTCTCAGGCTGCAGAAACCCATGCTCATGAAACTGGA 

AAAGTTCCAGCTTTACCGTGGCATCCTTCAAGTTCTGAGGGCCTGGCGGGTCATAATTGT 

GTCCCTTTGTTGGATTCAGACTTGAAGGACTCACTTTTACCCCGTAATGATTCCAACGCT 

CCTATACAAGGTTGTCGCCTTTTTGGAGCTACCGAATTAGAATGTAAGACTGATACAAAT 

GACGGTTTCATCGATACTTACGGACATGTAACTTCCCATGGCAATGATGATAATGGTGGT 

TTCCCAGAACAACAGGGGCTGTCATATATTCCCAAGGATTCTTTGAAGCTAGTACCTTTG 

AATAGTTTTTCTTCTCCTTCTAGAGTGAACAAGATTTATTTTCCTATTGACGATAAGCCG 

GCTGAAAAAGACAAAGGAGCTCTTTGTTATGAACCTCCACGTTTTCCAAGTGCAGATATT

CCTTTCTTCAGCTGTGATCTTGTACCATCAAATAGTGACTTACGGCT^AGAGTACAGTCCC 

TTTGGTATCCGTCAGTTGATGATTTCTTCAATGAATTGTACAACTCCGTTAAGGTTATGG 

GATTCACCGTGTCACGATAGGAGCCCTGATGTCATGCTTAATGATACTGCCAAAAGTTTT 

AGTGGTGCACCATCCATCTTAAAGAAGCGGCATCGAGACTTGCTTTCACCTGTGCTTGAT 

AGAAGAAAAGACAAAAAGCTTAAAAGGGCTGCGACTTCCTGCTTGGCTAATGATTTTTCG 

CGCTTAGATGTAATGCTTGATGAAGGAGATGATTGCATGACCTCTCGTCCGTCAGAGTCT 

CCTGAAGATAAAAA^ATATGTGCCTCCCCTTCCATAGCGAGAGATAACAGAAATTGTGCA 

TCAGCTCGGTTATATCAAGAAATGATTCCGATAGATGAGGAACCAAAGGAAACCTTAGAA 

TCAGGTGGAGTGACTTCTATGCAAAATGAAAATGGATGTAATGACGGTGGTGCTTCAGCT 

AAAAATGTAAGTCCGTCTTTGTCCTTGCATATTATCTGGTATCAGTTATAA 

>G657 Amino Acid Sequence (domain in AA coordinates: TBD) 

MKREMKAPTTPLESLQGDLKGKQGRTSGPARRSTKGQWTPEEDEVLCKAVERFQGKNWKK

IAECFKDRTDVQCLHRWQKVLNPELVKGPWSKEEDNTIIDLVEKYGAKKWSTISQHLPGR

IGKQCRERWHNHLNPGINKNAWTQEEELTLIRAHQIYGNKWAELMKFLPGRSDNSIKNHW
NSSVKKKLDSYYASGLLDQCQSSPLIALQNKSIASSSSWMHSNGDEGSSRPGVDAEESEC
SQASTVFSQSTNDLQDEVQRGNEEYYMPEFHSGTEQQISNAASHAEPYYPSFKDVKIVVP
EISCETECSKKFQNLNCSHELRTTTATEDQLPGVSNDAKQDRGLELLTHNMDNGGKNQAL
QQDFQSSVRLSDQPFLSNSDTDPEAQTLITDEECCRVLFPDNMKDSSTSSGEQGRNMVDP
QNGKGSLCSQAAETHAHETGKVPALPWHPSSSEGLAGHNCVPLLDSDLKDSLLPRNDSNA
PIQGCRLFGATELECKTDTNDGFIDTYGHVTSHGNDDNGGFPEQQGLSYIPKDSLKLVPL
NSFSSPSRVNKIYFPIDDKPAEKDKGALCYEPPRFPSADIPFFSCDLVPSNSDLRQEYSP
FGIRQLMISSMNCTTPLRLWDSPCHDRSPDVMLNDTAKSFSGAPSILKKRHRDLLSPVLD
RRKDKKLKRAATSSLANDFSRLDVMLDEGDDCMTSRPSESPEDKNICASPSIARDNRNCA
SARLYQEMIPIDEEPKETLESGGVTSMQNENGCNDGGASAKNVSPSLSLHIIWYQL*
>G1959 (141..1028)

CGTCGACTGTCCATAAATCCGGAGCCTGACCCGACGTTTGACCCGGATCCGAAACTCCCA 
CAATCTCCATACCACCCAAATTCATCTCCCCTAAAGCTTTCTCTCACTTTCCCGGGAAAA 
TCGGCGACCAAAATTGGAAAATGTACTCAGCGATTCGCTCGCTTCCACTCGATGGTGGAC 
ACGTTGGTGGTGACTACCATGGACCTCTTGACGGAACCAATCTTCCCGGTGACGCTTGTT 
TGGTTTTAACGACTGACCCTAAACCTCGTCTCCGGTGGACAACTGAGCTTCATGAGAGAT 
TCGTTGACGCCGTTACTCAGCTCGGTGGTCCTGACAAAGCGACTCCCAAAACTATTATGA 
GAACAATGGGAGTGAAGGGTCTCACTCTCTACCACCTCAAATCACATCTTCAGAAATTCC 
GCCTAGGGAGGCAAGCTGGCAAAGAATCAACTGAGAACTCTAAAGATGCTTCTTGTGTAG 
GGGAGAGTCAGGACACAGGTTCATCTTCGACATCATCAATGAGAATGGCGCAGCAGGAGC 
AGAACGAGGGTTACCAAGTCACCGAAGCTCTACGTGCTCAGATGGAAGTCCAAAGAAGAC 
TACACGATCAATTGGAGGTGCAACGGAGGCTCCAGCTGAGGATAGAGGCACAAGGAAAAT 
ACCTGCAATCGATTCTTGAAAAAGCTTGCAAGGCCTTTGACGAGCAAGCTGCTACTTTTG 
CTGGACTTGAGGCTGCTAGGGAAGAGCTATCAGAGCTAGCCATCAAAGTCTCCAATAGCT 
CTCAAGGAACATCAGTCCCGTACTTCGATGCAACAAAGATGATGATGATGCCATCGTTGT 
C^GAGCTTGCAGTAGC^TAGACAACAAAAACAACATC^^ 

GCTCTCTGACTTCCATCACACATGGGAGCTCTATATCTGCTGCATCAATGAAGAAGCGTC 
AACGTGGAGACAATTTGGGCGTAGGGTATGAATCAGGCTGGATTATGCCTAGTAGCACCA 
TTGGATAAAGTTTAGGAGAGGGAAAAAGTTCATTATGGGAAAGGTAGAGATAAGATTTAA 
CTGTTCTTTACTTGCTTTGAGGGGCCTGCGGCCGCT 

>G1959 Amino Acid Sequence (conserved domain in AA coordinates: 46-97)

MYSAIRSLPLDGGHVGGDYHGPLDGTNLPGDACLVLTTDPKPRLRWTTELHERFVDAVTQ

LGGPDKATPKTIMRTMGVKGLTLYHLKSHLQKFRLGRQAGKESTENSKDASCVGESQDTG

SSSTSSMRMAQQEQNEGYQVTEALRAQMEVQRRLHDQLEVQRRLQLRIEAQGKYLQSILE

KACKAFDEQAATFAGLEAAREELSELAIKVSNSSQGTSVPYFDATKMMMMPSLSELAVAI

DNKNNITTNCSVESSLTSITHGSSISAASMKKRQRGDNLGVGYESGWIMPSSTIG*

>G2180 (1..1440) 

ATGGCTCCTGTCTCGTTACCTCCAGGTTTCCGATTCCATCCAACAGACGAGGAACTAATT 
ACTTACTATCTAAAAAGAAAGATCAACGGTCTAGAAATCGAACTTGAAGTTATCGCTGAA 
GTTGATCTTTACAAGTGTGAGCCATGGGACTTACCAGGGAAGTCCTTGCTTCCGAGCAAA 
GACCAAGAATGGTACTTCTTCAGCCCACGAGACCGGAAGTATCCCAACGGCTCAAGGACA 
AACCGGGCAACTAAAGGCGGTTATTGGAAGGCTACAGGTAAAGACCGCCGAGTTAGTTGG 
AGAGACCGAGCCATAGGAACCAAGAAGACATTGGTTTACTACCGTGGGCGCGCGCCACAT 
GGCATAAGAACTGGTTGGGTCATGCACGAATATCGACTTGATGAAACAGAATGTGAGCCT 
TCTGCATACGGCATGCAGGACGCATATGCACTTTGTCGTGTGTTCAAAAAGATTGTTATT 
GAAGCTAAGCCAAGAGATCAACATCGGTCATATGTCCACGCGATGTCGAATGTGAGTGGT 
AATTGCTCATCGAGTTTTGACACTTGTTCGGATCTCGAAATCAGTTCAACTACTCATCAA 
GTTCAAAACACATTCCAACCGCGATTTGGCAACGAGCGATTTAACTCCAACGCAATCAGC 
AACGAGGATTGGTCACAATACTACGGTTCTTCTTATAGACCGTTCCCTACTCCATATAAG 
GTTAACACAGAGAT-GGAATGTTCAATGTTACAACACAATATATATCTACCACCGTTGCGT 
GTAGAGAACTCTGCGTTTAGTGATTCCGATTTCTTCACGAGTATGACTCACAACAACGAC 
CATGGCGTTTTCGATGACTTTACTTTTGCTGCAAGTAACTCCAACCACAATAATAGCGTT 
GGTGATCAAGTGATCCACGTTGGCAATTATGATGAACAATTAATAACATCTAACCGTCAT 
ATGAACCAGACTGGTTATATAAAAGAGCAGAAGATCAGATCGAGTTTGGATAATACTGAC 
GAAGATCCAGGATTTCATGGTAACAATACCAATGACAACATAGATATCGATGATTTTCTC 
TCGTTTGATATATATAACGAGGACAACGTGAATCAAATAGAAGATAATGAAGACGTGAAT 
ACAAATGAAACCCTTGATTCATCGGGATTCGAGGTGGTTGAAGAAGAAACTAGATTTAAC 
AACCAAATGCTCATCTCGACATATCAAACGACAAAGATTCTATATCACCAAGTCGTACCT 
TGTCACACGTTGAAAGTTCACGTCAATCCTATTAGTCACAATGTGGAAGAGAGAACATTG
TTCATTGAAGAGGACAAAGATTCTTGGTTACAAAGAGCTGAGAAGATCACGAAGACAAAA 

CTAACACTTTTTAGTTTAATGGCTCAGCAATACTACAAATGTCTTGCTATTTTTTTCTGA 

>G2180 Amino Acid Sequence (conserved domain in AA coordinates: 7-156)

MAPVSLPPGFRFHPTDEELITYYLKRKINGLEIELEVIAEVDLYKCEPWDLPGKSLLPSK

DQEWYFFSPRDRKYPNGSRTNRATKGGYWKATGKDRRVSWRDRAIGTKKTLVYYRGRAPH

GIRTGWVMHEYRLDETECEPSAYGMQDAYALCRVFKKIVIEAKPRDQHRSYVHAMSNVSG

NCSSSFDTCSDLEISSTTHQVQNTFQPRFGNERFNSNAISNEDWSQYYGSSYRPFPTPYK
VNTEIECSMLQHNIYLPPLRVENSAFSDSDFFTSMTHNNDHGVFDDFTFAASNSNHNNSV

GDQVIHVGNYDEQLITSNRHMNQTGYIKEQKIRSSLDNTDEDPGFHGNNTNDNIDIDDFL
SFDIYNEDNVNQIEDNEDVNTNETLDSSGFEVVEEETRFNNQMLISTYQTTKILYHQVVP
CHTLKVHVNPISHNVEERTLFIEEDKDSWLQRAEKITKTKLTLFSLMAQQYYKCLAIFF*
>G1817 (1..1308)

ATGAAGGACGCAGAGAAGCGAGAGGTGATTGCATCATCATCATTACAAAGAAAGAGAAAC 
AGAGGAAGAAGACTAAGGAAAAGAAGAAGAAGAAACGAGAAGCGAGTACTAATGGTTCCA 
TCATCATTACCAAACGACGTGCTAGAGGAGATCTTTTTAAGATTTCCGGTTAAAGCCCTA 
ATCCGACTCAAGTCTCTCTCGAAACAATGGAGATCGACGATCGAATCTCGCAGTTTTGAA 
GAGAGACACTTGACGATCGCTAAGAAAGCCTTCGTGGATCATCCCAAGGTCATGCTCGTA 
GGAGAAGAAGATCCCATAAGAGGAACCGGGATTCGTCCAGACACTGACATTGGTTTTAGG 
TTATTCTGCTTGGAATCGGCTTCTCTTCTATCCTTTACTCGTCTCAATTTCCCTCAAGGG 
TTCTTCAACTGGATCTACATATCTGAAAGCTGTGATGGCCTTTTCTGCATCCATTCCCCA 
AAATCACATTCCGTATATGTAGTGAATCCGGCTACACGGTGGCTCCGCCTACTTCCTCCG 
GCAGGGTTTCAGATTTTGATCCACAAGTTTAACCCCACTGAACGTGAGTGGAATGTAGTG 
ATGAAATCAATCTTTCATCTAGCATTCGTGAAGGCCACCGATTACAAATTAGTGTGGTTG 
TACAATTGTGATAAGTACATTGTTGATGCGTCGAGTCCAAACGTGGGAGTCACAAAGTGC 
GAGATTTTTGACTTTAGGAAAAATGCTTGGAGGTACTTGGCTTGCACTCCAAGTCATCAG 
ATATTCTATTACCAAAAGCCAGCATCTGCAAACGGGTCGGTTTATTGGTTTACAGAACCA 
TATAATGAAAGAATCGAAGTAGTGGCTTTTGATATTCAGACCGAAACATTCCGGTTGCTG 
CCTAAGATTAATCCGGCTATTGCTGGTTCAGATCCTCACCATATTGACATGTGCACTCTG 
GATAATAGTTTGTGTATGTCGAAAAGGGAGAAAGATACTATGATCCAAGATATTTGGAGG 
TTGAAACCATCAGAAGACACATGGGAAAAGATTTTTAGGATAGACTTGGTTTCCTGTCCT 
TCTTCTCGGACTGAGAAGCGTGATCAATTTGATTGGAGCAAGAAGGATAGGGTTGAGCCA 
GCCACACCCGTCGCGGTTTGTAAGAATAAGAAGATCCTTCTCTCACATCGCTATTCCCGA 
GGTTTGGTAAAGTACGATCCCCTAACAAAATCTATCGATTTTTTTTCCGGACATCCTACC 
GCTTACAGAAAAGTTATTTATTTTCAAAGTTTGATATCTCATCTATAA 

>G1817 Amino Acid Sequence (conserved domain in AA coordinates: 47-331)

MKDAEKREVIASSSLQRKRNRGRRLRKRRRRNEKRVLMVPSSLPNDVLEEIFLRFPVKAL

IRLKSLSKQWRSTIESRSFEERHLTIAKKAFVDHPKVMLVGEEDPIRGTGIRPDTDIGFR

LFCLESASLLSFTRLNFPQGFFNWIYISESCDGLFCIHSPKSHSVYVVNPATRWLRLLPP

AGFQILIHKFNPTEREWNVVMKSIFHLAFVKATDYKLVWLYNCDKYIVDASSPNVGVTKC

EIFDFRKNAWRYLACTPSHQIFYYQKPASANGSVYWFTEPYNERIEVVAFDIQTETFRLL

PKINPAIAGSDPHHIDMCTLDNSLCMSKREKDTMIQDIWRLKPSEDTWEKIFSIDLVSCP

SSRTEKRDQFDWSKKDRVEPATPVAVCKNKKILLSHRYSRGLVKYDPLTKSIDFFSGHPT

AYRKVIYFQSLISHL* 

>G1649 (61..1311)

ATTCACAAAAACCGGAAAAAAAAAAAGACAAGTAAAGAAAGCTTTGTTCAGTTTACTTCA 
ATGGAAGCAAAACCCTTAGCATCATCATCATCTGAACCAAACATGATTTCTCCATCATCA 
AACATTAAACCAAAATTAAAAGATGAAGATTATATGGAGCTGGTGTGTGAAAATGGGCAG 
ATTCTTGCAAAGACTCGAAGACCAAAGAACAACGGTTCTTTTCAAAAGC^^CGTAGGCAA 
TCTCTCCTGGATTTGTATGAGACCGAGTACAGCGAGGGTTTCAAGAAAAACATCAAGATT 
CTTGGAGACACACAAGTTGTTCCGGTGAGTCAGTCTAAGCCACAACAAGATAAAGAAACC 
AATGAACAAATGAACAACAATAAGAAGAAGCTAAAGTCCTCCAAAATCGAATTTGAGAGA 
AATGTTTCGAA7UVGCAACAAATGTGTTGAATCATCAACATTAATTGATGTTTCTGCTAAA 
GGTCCAAAGAATGTTGAAGTTACTACAGCTCCTCCTGATGAGCAATCTGCAGCTGTTGGT 
AGATCCACGGAATTGTATTTTGCTTCTTCATCGAAGTTTTCTCGAGGAACTTCGAGAGAT 
CTAAGTTGTTGTTCTTTAAAGAGGAAGTATGGAGATATTGAAGAAGAAGAATCAACCTAT 
TTAAGTAATAATTCAGATGATGAATCAGATGATGCGAAGACACAAGTTCATGCGAGAACA 
AGAAAGCCGGTGACTAAAAGAAAACGAAGCACAGAAGTCCATAAGTTATATGAAAGAAAA 
CGAAGAGATGAATTCAACAAGAAAATGCGTGCTTTGCAGGACCTACTACCAAATTGTTAC
AAGGATGATAAGGCTTCATTGTTGGATGAGGCTATCAAATATATGCGGACCCTTCAACTT 
CAAGTTCAGATGATGAGTATGGGAAATGGATTAATAAGACCACCTACGATGTTGCCAATG 
GGTCATTACTCTCCCATGGGTCTAGGAATGCATATGGGTGCAGCAGCAACACCAACATCA 
ATACCGCAATTCCTGCCTATGAATGTTCAAGCAACCGGTTTTCCGGGGATGAACAATGCA 
CCACCACAAATGCTAAGCTTTCTTAATCACCCAAGTGGACTAATTCCAAACACTCCTATC 
TTTTCTCCATTGGAAAATTGCTCTCAGCCATTCGTGGTGCCTTCGTGTGTTTCTCAGACT 
CAGGCTACTTCTTTTACTCAATTCCCAAAGTCTGCGTCCGCCTCAAACTTAGAAGATGCA 
ATGCAATATAGAGGAAGCAACGGTTTTAGTTATTATCGCTCGCGAAACTAATGATTTGTA 
GAAAGTTGATGTTTTCTCCAACTAACTAACTTTAAGCAAAAAAAAATGATCGTCTACTCT 
GTGTTGTTAGTCTATGGGCTTTTGGGCCTTGATTCTTGGAACGATTTGAACTTAATTCCA 
ACTATTTTCAAAGTGGATGTACAAAGTAAAA 

>G1649 Amino Acid Sequence (conserved domain in AA coordinates: 225-295)
MEAKPLASSSSEPNMISPSSNIKPKLKDEDYMELVCENGQILAKTRRPKNNGSFQKQRRQ
SLLDLYETEYSEGFKKNIKILGDTQVVPVSQSKPQQDKETNEQMNNNKKKLKSSKIEFER

NVSKSNKCVESSTLIDVSAKGPKNVEVTTAPPDEQSAAVGRSTELYFASSSKFSRGTSRD

LSCCSLKRKYGDIEEEESTYLSNNSDDESDDAKTQVHARTRKPVTKRKRSTEVHKLYERK

RRDEFNKKMRALQDLLPNCYKDDKASLLDEAIKYMRTLQLQVQMMSMGNGLIRPPTMLPM

GHYSPMGLGMHMGAAATPTSIPQFLPMNVQATGFPGMNNAPPQMLSFLNHPSGLIPNTPI

FSPLENCSQPFVVPSCVSQTQATSFTQFPKSASASNLEDAMQYRGSNGFSYYRSPN*

>G2131 (69..1010)

GTCTCTCATTTTCATAATTCCATTTTCAGGATTGTCTCTCAATCTTTTATTCTTCTCATT 
CACCGGTAATGGCAAAAGTCTCTGGGAGGAGCAAGAAAACAATCGTTGACGATGAAATCA 
GCGATAAAACAGCGTCTGCGTCTGAGTCTGCGTCCATTGCCTTAACATCCAAACGCAAAC 
GTAAGTCGCCGCCTCGAAACGCTCCTCTTCAACGCAGCTCCCCTTACAGAGGCGTCACAA 
GGCATAGATGGACTGGGAGATACGAAGCGCATTTGTGGGATAAGAACAGCTGGAACGATA
CACAGACCAAGAAAGGACGTCAAGTTTATCTAGGGGCTTACGACGAAGAAGAAGCAGCAG 
CACGTGCCTACGACTTAGCAGCATTGAAGTACTGGGGACGAGACACACTCTTGAACTTCC 
CTTTGCCGAGTTATGACGAAGACGTCAAAGAAATGGAAGGCCAATCCAAGGAAGAGTATA 
TTGGATCATTGAGAAGAAAAAGTAGTGGATTTTCTCGCGGTGTATCAAAATACAGAGGCG 
TTGCAAGGCATCACCATAATGGGAGATGGGAAGCTAGAATTGGAAGGGTGTTTGGTAATA 
AATATCTATATCTTGGAACATACGCCACGCAAGAAGAAGCAGCAATCGCCTACGACATCG 
CGGCAATAGAGTACCGTGGACTTAACGCCGTTACCAATTTCGACGTCAGCCGTTATCTAA 
ACCCTAACGCCGCCGCGGATAAAGCCGATTCCGATTCTAAGCCCATTCGAAGCCCTAGTC 
GCGAGCCCGAATCGTCGGATGATAACAAATCTCCGAAATCAGAGGAAGTAATCGAACCAT 
CTACATCGCCGGAAGTGATTCCAACTCGCCGGAGCTTCCCCGACGATATCCAGACGTATT 
TTGGGTGTCAAGATTCCGGCAAGTTAGCGACTGAGGAAGACGTAATATTCGATTGTTTCA 
ATTCTTATATAAATCCTGGCTTCTATAACGAGTTTGATTATGGACCTTAATCGTATTTTC 
TACAAGTTTTGTTTTGATTATCTACACAATACATCAATATATTCT 

>G2131 Amino Acid Sequence (conserved domain in AA coordinates: 50-186, 112-183)
MAKVSGRSKKTIVDDEISDKTASASESASIALTSKRKRKSPPRNAPLQRSSPYRGVTRHR
WTGRYEAHLWDKNSWNDTQTKKGRQVYLGAYDEEEAAARAYDLAALKYWGRDTLLNFPLP
SYDEDVKEMEGQSKEEYIGSLRRKSSGFSRGVSKYRGVARHHHNGRWEARIGRVFGNKYL
YLGTYATQEEAAIAYDIAAIEYRGLNAVTNFDVSRYLNPNAAADKADSDSKPIRSPSREP
ESSDDNKSPKSEEVIEPSTSPEVIPTRRSFPDDIQTYFGCQDSGKLATEEDVIFDCFNSY
INPGFYNEFDYGP*
>G215 (1..1110)

ATGACTCGTCGGTGTTCGCATTGTAGGAACAATGGGCACAATTCACGCACGTGTCCAACG 
CGTGGGTCTGGTTCCTCCTCCGCCGTGAAGTTATTTGGTGTGAGGTTAACGGATGGCTCG 
ATTATTAAAAAGAGTGCGAGTATGGGTAATCTCTCGGCATTGGCTGTTGCGGCGGCGGCG 
GCftACGCACCACCGTTTATCTCCGTCGTCTCCTCTGGCGACGTCAAATCTTAATGATTCG 
CCGTTATCGGATCATGCCCGATACTCTAATTTGCATCATAATGAAGGGTATTTATCTGAT 
GATCCTGCTCATGGTTCTGGGTCTAGTCACCGTCGTGGTGAGAGGAAGAGAGGTGTTCCT 
TGGACTGAAGAGGAACATAGACTATTCTTAGTCGGTCTTCAGAAACTCGGGAAAGGAGAT 
TGGCGCGGTATTTCGAGAAACTATGTAACGTCAAGAACTCCTACACAAGTGGCTAGTCAT 
GCTCAAAAGTATTTTATTCGACATACTAGTTCAAGCCGCAGGAAAAGACGGTCTAGCCTC 
TTCGACATGGTTACAGATGAGATGGTAACCGATTCATCGCCAACACAGGAAGAGCAGACC 
TTAAACGGTTCCTCTCCAAGCAAGGAACCTGAAAAGAAAAGCTACCTTCCTTCACTTGAG
CTCTCACTCAATAATACCACAGAAGCTGAAGAGGTCGTAGCCACGGCGCCACGACAGGAA 
AAATCTCAAGAAGCTATAGAACCATCAAATGGTGTTTCACCAATGCTAGTCCCGGGTGGC 
TTCTTTCCTCCTTGTTTTCCAGTGACTTACACGATTTGGCTCCCTGCGTCACTTCACGGA 
ACAGAACATGCCTTAAACGCTGAGACTTCTTCTCAGCAGCATCAGGTCCTAAAACCAAAA 
CCTGGATTTGCTAAAGAACGTGTGAACATGGACGAGTTGGTCGGTATGTCTCAGCTTAGC 
ATAGGAATGGCGACAAGACACGAAACCGAAACTTCCCCTTCCCCGCTATCTTTGAGACTA 
GAGCCCTCAAGGCCATCAGCGTTTCACTCGAATGGCTCGGTTAATGGTGCAGATTTGAGT 
AAAGGCAACAGCGCGATTCAGGCTATCTAA

>G215 Amino Acid Sequence (domain in AA coordinates: TBD) 

MTRRCSHCSNNGHNSRTCPTRGSGSSSAVKLFGVRLTDGSIIKKSASMGNLSALAVAAAA

ATHHRLSPSSPLATSNLNDSPLSDHARYSMLHHNEGYLSDDPAHGSGSSHRRGERKRGVP

WTEEEHRLFLVGLQKLGKGDWRGISRNYVTSRTPTQVASHAQKYFIRHTSSSRRKRRSSL

FDMVTDEMVTDSSPTQEEQTLNGSSPSKEPEKKSYLPSLELSLNNTTEAEEVVATAPRQE

KSQEAIEPSNGVSPMLVPGGFFPPCFPVTYTIWLPASLHGTEHALNAETSSQQHQVLKPK

PGFAKERVNMDELVGMSQLSIGMATRHETETSPSPLSLRLEPSRPSAFHSNGSVNGADLS

KGNSAIQAI* 

>G1508 (1..420)

ATGCTAGATCACAGTGAAAAGGTCTTATTGGTTGATTCAGAAACCATGAAAACAAGAGCT 

GAAGATATGATCGAACAGAACAACACTAGTGTTAACGACAAGAAGAAGACTTGTGCTGAT 

TGTGGAACCAGTAAAACTCCTCTTTGGCGTGGTGGTCCTGTTGGTCCAAAGTCGTTGTGT 

AACGCGTGTGGGATCAGAAACAGAAAGAAGAGAAGAGGAGGAACAGAAGATAATAAGAAA 

TTAAAGAAATCGAGTTCTGGCGGCGGAAACCGTAAATTTGGTGAATCGTTAAAACAGAGT 

TTGATGGATTTGGGGATAAGGAAGAGATCAACGGTGGAGAAGCAACGACAGAAGCTTGGT 

GAAGAAGAACAAGCCGCTGTGTTACTCATGGCTCTTTCTTATGGCTCTGTTTACGCTTAG 

>G1508 Amino Acid Sequence (domain in AA coordinates: 38-63) 

MLDHSEKVLLVDSETMKTRAEDMIEQNNTSVNDKKKTCADCGTSKTPLWRGGPVGPKSLC

NACGIRNRKKRRGGTEDNKKLKKSSSGGGNRKFGESLKQSLMDLGIRKRSTVEKQRQKLG

EEEQAAVLLMALSYGSVYA* 

>G2110 (36..1622)

GAGAGCTAATAAAAAATTTATCAAAGAAGACTAATATGGAGAAGGACGATTTCTTGAGGA 

GTGGTCATGGAAGAGAAGAAAGCCATGATGAGATGAGAAAACTTGATTCATCTCACGATG 

ATTCTCATCAAGAACACGACCATATTATAAGATCCAAGTTGGACTCAACTAAAGTCGAAA 

TGGATGAGGCTAAAGAGGAAAATCGAAGACTAAAGTCATCATTGAGTAAAATCAAGAAAG 

ATTTTGACATCCTTCAAACACAATACAACCAATTAATGGCCAAACATAACGAACCAACCA 

AGTTCCAATCAAAAGGGCATCATCAAGACAAAGGCGAAGATGAAGACAGAGAAAAAGTTA 

ACGAACGTGAAGAACTTGTCTCGTTGAGCCTAGGCAGACGGTTAAATTCAGAGGTTCCAA 

GTGGTTCGAATAAAGAAGAAAAAAATAAAGATGTTGAAGAAGCGGAAGGTGACAGAAATT 

ATGATGATAATGAAAAAAGCAGTATTCAAGGGTTGAGTATGGGGATTGAATACAAGGCTT 

TGAGTAATCCTAATGAGAAGTTAGAGATTGATCATAATCAAGAAACCATGTCGTTGGAGA 

TTAGTAACAATAATAAGATCAGATCACAAAATAGTTTTGGGTTTAAGAATGATGGAGATG 

ATCATGAAGATGAAGATGAGATTTTGCCTCAAAACCTTGTTAAGAAAACTAGGGTTTCGG 

TGAGATCAAGATGTGAGACACCAACGATGAACGACGGATGTCAATGGAGGAAATATGGCC 

AGAAAATAGCTAAAGGCAATCCATGTCCCCGAGCTTACTATCGTTGCACCATTGCAGCTT 

CTTGTCCAGTAAGAAAACAGGTGCAAAGATGTTCAGAAGATATGTCTATACTTATCTCAA 

CGTACGAAGGAACAGATAACCATCCACTTCCCATGTCAGCAACTGCCATGGCCTCTGCCA 

CTTCCGCTGCCGCCTCCATGCTTCTCTCCGGCGCCTCCTCCTCCTCATCCGCCGCAGCTG 

ATCTTCATGGCCTTAACTTCTCTCTTTCCGGCAACAACATCACTCCAAAACCTAAAACTC 

ATTTCCTCCAATCCCCTTCTTCTTCTGGCCATCCGACCGTCACTCTCGACCTCACAACCT 

CCTCCTCGTCGCAGCAACCGTTCTTATCAATGCTCAATAGATTCAGCTCTCCTCCAAGTA 

ATGTCTCACGATCTAATAGTTATCCTTCAACCAATCTCAACTTTTCAAACAACACCAACA 

CATTGATGAATTGGGGTGGTGGTGGTAATCCCAGTGATCAATACCGTGCAGCTTACGGCA 

ACATTAACACCCATGAGCAATCACCTTACCACAAAATCATTCAAACCCGAACCGCCGGGT 

CATCTTTCGATCCGTTTGGAAGATCATCTTCATCACATTCTCCACAAATAAATCTTGATC 

ATATCGGAATCAAGAACATCATCAGTCACCAAGTGCCATCTTTACCGGCTGAAACAATCA 

AGGCAATCACGACAGATCCAAGTTTCCAATCGGCTTTGGCGACAGCTCTATCTTCCATCA 

TGGGCGGCGATTTAAAGATTGATCACAATGTGACTAGAAATGAAGCTGAGAAGAGCCCTT
AAAGAGAATTGTTATATATATGTTCTTATATACTCAGTACATTGGTAAATGGGTTTAGAC 
TTTCACTAGTTTCCTAGTTCATCTATATATTGGTTO 

TTGGAGTTTATGGAACTAATGTGTACATATGAAACTTTAGAACGAATAAATAAAACTTGG 
AATTCCTTTTTAAAAAAAAAAAAAAAAA 

>G2110 Amino Acid Sequence (conserved domain in AA coordinates: 239-298)
MEKDDFLRSGHGREESHDEMRKLDSSHDDSHQEHDHIIRSKLDSTKVEMDEAKEENRRLK

SSLSKIKKDFDILQTQYNQLMAKEMEPTKFQSKGHHQDKGEDEDREKVNEREELVSLSLG
RRLNSEVPSGSNKEEKNKDVEEAEGDRNYDDNEKSSIQGLSMGIEYKALSNPNEKLEIDH
NQETMSLEISNNNKIRSQNSFGFKNDGDDHEDEDEILPQNLVKKTRVSVRSRCETPTMND

GCQWRKYGQKIAKGNPCPRAYYRCTIAASCPVRKQVQRCSEDMSILISTYEGTHNHPLPM
SATAMASATSAAASMLLSGASSSSSAAADLHGLNFSLSGNNITPKPKTHFLQSPSSSGHP
TVTLDLTTSSSSQQPFLSMLNRFSSPPSNVSRSNSYPSTNLNFSNNTNTLMNWGGGGNPS

DQYRAAYGNINTHQQSPYHKIIQTRTAGSSFDPFGRSSSSHSPQINLDHIGIKNIISHQV

PSLPAETIKAITTDPSFQSALATALSSIMGGDLKIDHNVTRNEAEKSP*

>G2442 (71..997)

TCGACCAATTTAGACCATTCCAAATTCGTCGTCCTTTTCTCTGTGTAGTCTAATTATATA 
TTACAAGTAGATGAATTGGTTACCTGAAGCTGAAGCTGAGGAGCACTTGAAAGGTATTCT
CTCTGGTGATTTCTTTGATGGTCTCACCAATCACCTTGATTGCCCACTTGAAGACATCGA 
TTCCACCAATGGTGAGGGAGATTGGGTCGCCAGGTTTCAAGACCTTGAGCCTCCTCCCTT 
GGATATGTTCCCTGCTTTGCCTTCTGACCTCACCTCTTGTCCCAAGGGCGCCGCTCGTGT 
GCGGATTCCCAACAACATGATTCCTGCTTTGAAGCAiSTCCTGTTCTTCTGAAGCCTTGTC 
CGGCATTAATAGCACTCCCCACCAATCTTCAGCTCCTCCTGATATCAAAGTTTCATATCT 
ATTTCAGTCTCTAACTCCAGTGTCAGTTCTCGAGAACAGTTATGGTTCTCTCTCCACCCA 
AAACTCCGGATCTCAGAGATTGGCTTTCCCTGTGAAAGGCATGAGAAGCAAGCGCAGACG 
CCCCACAACAGTGAGACTTAGCTACCTTTTCCCCTTTGAACCCAGAAAGTCAACTCCGGG 
TGAATCAGTAACCGAGGGTTACTATTCTTCTGAGCAACATGCCAAGAAGAAGCGCAAGAT
TCATCTGATCACCCACACCGAGTCTTCCACTTTGGAGTCAAGTAAGTCGGATGGGATAGT 
CCGGATATGCACTCATTGTGAGACAATCACGACCCCACAGTGGAGGCAAGGACCCAGTGG 
ACCCAAGACCCTCTGCAACGCTTGCGGAGTCCGGTTCAAATCTGGTCGCCTAGTTCCAGA 
ATACCGGCCAGCCTCAAGCCCGACCTTCATCCCATCTGTGCATTCAAACTCACACAGGAA 
GATCATTGAGATGAGAAAGAAGGACGACGAGTTTGATACCAGCATGATTCGCAGTGATAT 
CCAGAAGGTAAAGCAGGGGAGGAAGAAAATGGTATAAAAGTA 

>G2442 Amino Acid Sequence (domain in AA coordinates: 220-246)

MNWLPEAEAEEHLKGILSGDFFDGLTNHLDCPLEDIDSTNGEGDWVARFQDLEPPPLDMF

PALPSDLTSCPKGAARVRIPNNMIPALKQSCSSEALSGINSTPHQSSAPPDIKVSYLFQS

LTPVSVLENSYGSLSTQNSGSQRLAFPVKGMRSKRRRPTTVRLSYLFPFEPRKSTPGESV 

TEGYYSSEQHAKKKRKIHLITHTESSTLESSKSDGIVRICTHCETITTPQWRQGPSGPKT 

LCNACGVRFKSGRLVPEYRPASSPTFIPSVHSNSHRKIIEMRKKDDEFDTSMIRSDIQKV
KQGRKKMV* 

>G1051 (66..1031)

CCTGTAAATTCAGATTTGCTTTCTTTGGTAATCTTTTGGATCAAGATCCATCTATTTTTT 

CTTCAATGGCACAACTCCCTCCTAAAATCCCCAACATGACACAACATTGGCCTGATTTCT 

CTTCCCAAAAGCTCTCTCCTTTCTCTACCCCAACCGCAACCGCTGTCGCCACCGCTACAA 

CCACCGTACAAAACCCCTCATGGGTCGACGAATTCCTCGACTTCTCAGCGTCTCGCCGTG 

GCAACCACCGTCGTTCCATCAGGGACTCTATCGCATTCCTCGAAGCTCCAACAGTCAGCA 

TCGAAGACCACCAATTCGACAGGTTCGATGACGAACAGTTCATGTCGATGTTCACCGACG 

ACGACAACCTTCATAGCAATCCTTCCCATATCAA(^CAAAAATAACAATGTGGGGCCCA 

CGGGATCTTCCTCGAACACATCCACGCCGTCCAATAGCTTCAACGACGATAACAAAGAAT 

TACCACCGTCCGATCATAACATGAACAATAATATCAACAACAACTATAACGATGAAGTCC 

AAAGCCAATGCAAGATGGAGCCAGAAGATGGTACGGCGTCGAATAACAATTCCGGTGATA 

GCTCCGGCAACCGGATTCTCGATCCCAAAAGGGTTAAGAGAATATTAGCAAATCGGCAAT 

CAGCACAGAGATCAAGGGTGAGGAAACTGCAATACATATCAGAGCTCGAACGTAGCGTCA 

CTTCGTTGCAGGCGGAAGTGTCAGTGTTATCGCCAAGAGTTGCATTCTTGGATCATCAAC 

GTTTGCTTCTTAACGTTGACAACAGCGCTCTCAAGCAACGAATCGCTGCTTTATCTCAAG 

ACAAGCTTTTCAAAGACGCACATCAAGAAGCATTGAAGAGAGAAATAGAGAGACTTCGAC 

AAGTGTATAATCAACAAAGCCTCACGAATGTGGAAAATGCAAATCATTTATCGGCGACCG 

GAGCCGGTGCTACTCCGGCCGTCGACATCAAGTCGTCCGTTGAAACAGAGCAGCTCCTCA
ATGTCTCATAAATTAACCATCATGCAT 

CAAAAGTTCTTGACTATAAAATCTCTTTCGGGTAAGAAATTCAGGAGATATACATTTTTT 
ATTCTAATCACATTGTTTTTAAGTTGTGATGAATTCAGTTTGATGTATCTTATTTATTTT 
GTTTATGTCGTCTTTTTTTCTTGGGGTTGATGGAAGGGAATCATCAATTGTTGTTTGTAC 

>G1051 Amino Acid Sequence (domain in AA coordinates: 189-250)
MAQLPPKIPNMTQHWPDFSSQKLSPFSTPTATAVATATTTVQNPSWVDEFLDFSASRRGN
HRRSISDSIAFLEAPTVSIEDHQFDRFDDEQFMSMFTDDDNLHSNPSHINNKNNNVGPTG
SSSNTSTPSNSFNDDNKELPPSDHNMNNNINNNYNDEVQSQCKMEPEDGTASNNNSGDSS

GNRILDPKRVKRILANRQSAQRSRVRKLQYISELERSVTSLQAEVSVLSPRVAFLDHQRL

LLNVDNSALKQRIAALSQDKLFKDAHQEALKREIERLRQVYNQQSLTNVENANHLSATGA

GATPAVDIKSSVETEQLLNVS*

>G1052 (138..1127)

TGATCATCTAAAACTTTCAATTTCTCTCTT 

TCAAATCTTTGATCCTTTCCTTTGTTTTTCATTTGACCTCTTACAAAAAAATCTGGTGTG 

CCATTAAATCTTTATTAATGGCACAACTTCCTCCGAAAATCCCAACCATGACGACGCCAA 

ATTGGCCTGACTTCTCCTCCCAGAAACTCCCTTCCATAGCCGCAACGGCGGCAGCCGCAG 

CAACCGCTGGACCTCAACAACAAAACCCTTCATGGATGGATGAGTTTCTCGACTTCTCAG 

CGACTCGCCGTGGGACTCACCGTCGTTCTATAAGCGACTCCATTGCTTTCCTTGAACCAC 

CTTCCTCCGGCGTCGGAAACCACCACTTCGATAGGTTTGACGACGAGCAATTCATGTCCA 

TGTTCAACGACGACGTACACAACAATAACCACAATCATCATCATCATCACAGCATCAACG 

GCAATGTGGGTCCCACGCGTTCATCCTCCAACACCTCCACGCCGTCCGATCATAATAGCC 

TTAGCGACGACGACAACAACAAAGAAGCACCACCGTCCGATCATGATCATCACATGGACA 

ATAATGTAGCCAATCAAAACAACGCCGCCGGTAACAATTACAACGAATCAGACGAGGTCC 

AAAGCCAGTGCAAGACGGAGCCACAAGATGGTCCGTCGGCGAATCAAAACTCCGGTGGAA 

GCTCCGGTAATCGTATTCACGACCCTAAAAGGGTAAAAAGAATTTTAGCAAATAGGCAAT 

CAGCACAGAGATCAAGGGTGAGGAAATTGCAATACATATCAGAGCTTGAAAGGAGCGTTA 

CTTCATTGCAGACTGAAGTGTCAGTGTTATCGCCAAGAGTTGCGTTTTTGGATCATCAGC 

GATTGCTTCTCAACGTCGACAATAGTGCTATCAAGCAACGAATCGCAGCTTTAGCACAAG 

ATAAGATTTTCAAAGACGCTCATCAAGAAGCATTGAAGAGAGAAATAGAGAGACTTCGAC 

AAGTATATCATCAACAAAGCCTCAAGAAGATGGAGAATAATGTCTCCGATCAATCTCCGG

CCGATATCAAACCGTCCGTTGAGAAGGAACAGCTCCTCAATGTCTAAAGCTGTTCGTTCA 

CTAAGATCTTTCTTTTCATGGCGAAAAGATTCTTGACTATAAAACCTCTTTGTGTCAAGA 

AATTAATTTATCAAAGAAGATGGCCTTTTTTATTTGATCTAATCACATTTTTTTAAGTTG 

TGATGAATTTGCTTTTGATGTATCTGTXTTTTTTTTTTTTTTTT 

>G1052 Amino Acid Sequence (domain in AA coordinates: 201-261)

MAQLPPKIPTMTTPNWPDFSSQKLPSIAATAAAAATAGPQQQNPSWMDEFLDFSATRRGT
HRRSISDSIAFLEPPSSGVGNHHFDRFDDEQFMSMFNDDVHNNNHNHHHHHSINGNVGPT
RSSSNTSTPSDHNSLSDDDNNKEAPPSDHDHHMDNNVANQNNAAGNNYNESDEVQSQCKT

EPQDGPSANQNSGGSSGNRIHDPKRVKRILANRQSAQRSRVRKLQYISELERSVTSLQTE

VSVLSPRVAFLDHQRLLLNVDNSAIKQRIAALAQDKIFKDAHQEALKREIERLRQVYHQQ 

SLKKMENNVSDQSPADIKPSVEKEQLLNV* 

>G1079 (1..1995) 

ATGGGTTGTGCTGCTTCAAGAATTGATAATGAAGAAAAGGTTTTAGTGTGTAGGCAGAGA 
AAGAGGCTAATGAAAAAGTTATTAGGGTTCAGGGGAGAATTTGCAGATGCACAGTTGGCT 
TATCTTAGAGCTTTGAGGAACACTGGTGTTACTCTTAGGCAATTCACTGAGTCTGAGACC 
TTGGAGCTTGAAAACACTAGTTATGGTTTAAGTTTGCCTTTGCCTCCTTCGCCTCCTCCT 
ACATTGCCTCCTTCACCTCCACCACCTCCTCCATTTAGCCCGGATTTGAGAAATCCTGAG 
ACTAGTCATGACTTGGCTGATGAGGAGGAAGAGGGTGAAAATGATGGTGGTAATGATGGA 
AGTGGTGCAGCTCCTCCGCCTCCATTGCCGAATTCTTGGAACATTTGGAACCCTTTTGAG
TCACTTGAGCTGCATAGTCATCCAAATGGTGACAATGTAGTTACACAAGTTGAACTGAAG 
AAGAAACAACAAATTCAGCAAGCTGAAGAGGAAGATTGGGCGGAGACGAAGTCTCAATTT 
GAGGAAGAAGATGAGCAACAAGAAGCAGGAGGTACTTGCCTTGATTTGAGTGTTCATCAA 
ATAGAGGCTGTTAGTGGCTGTAACATGAAGAAGCCACGTCGTCTGAAGTTTAAGCTGGGA 
GAAGTTATGGACGGTAACTCATCTATGACAAGCTGCTCCGGTAAAGATCTTGAGAAAACT 
CATGTGACTGATTGTAGAATCAGGAGGACCTTAGAAGGAATCATCAGAGAGTTGGATGAT 
TATTTTCTTAAAGCATCGGGTTGCGAGAAGGAGATAGCTGTGATAGTAGACATCAACAGT 
AGGGATACTGTTGATCCTTTCAGGTACCAGGAAACAAGAAGGAAGAGAAGCAGCTCGGCA

AAGGTATTCAGTGCATTGTCATGGAGTTGGTCTTCAAAGTCTCTTCAGTTGGGCAAAGAT 

GCTACAACAAGCGGGACTGTTGAACCCTGTAGGCCTGGAGCTCACTGCAGCACACTTGAG 

AAGCTATACACAGCTGAGAAGAAACTTTACCAGCTAGTCAGAAACAAAGAGATTGCCAAA 

GTGGAGCATGAGAGGAAGTCTGCATTACTGCAAAAGCAAGATGGGGAAACCTATGATTTG 

AGCAAAATGGAGAAAGCACGCTTGTCTTTGGAGAGTTTGGAAACCGAGATACAGCGTCTA 

GT^GATTCCATAACTACAACACGCTCATGTTTGCTTAACTTGATCAATGATGAGCTGTAT 

CCGCAGCTAGTTGCTTTAACTTCAGGGCTAGCACAGATGTGGAAAACAATGCTCAAGTGT 

GATCAAGTTCAAATTGATATATCCCAGCAACTGAACGATCTTCCGGATTACCCGAGTATA 

GATCTCAGTTCGGAATACAAACGCCAGGCGGTTAATGAACTAGAGACCGAGGTTACTTGC 

TGGTACAATAGCTTTTGCAAGTTAGTAAATTCCCAGCGAGAATACGTGAAAACACTCTGT 

ACGTGGATCCAACTTACTGATCGCCTCTCTAACGAAGACAACCAAAGAAGTAGCTTGCCT 

GTTGCTGCTCGTAAGCTCTGCAAAGAGTGGCAGCTTGAATACAACCTGCGTAGGAAATGC 

AATAAACTTGAGAGGAGGCTTGAGAAAGAGCTAATTTCACTGGCTGAGATTGAAAGAAGG 

CTCGAGGGGATTTTAGCAATGGAAGAGGAGGAAGTAAGCTCAACGAGTTTGGGCTCTAAG 

CATCCGTTGTCAATGAAACAAGCCAAGATCGAAGCCTTGAGAAAACGAGTGGATATTGAG 

AAAACTAAGTACTTAAACTCGGTCGAGGTTAGTAAGAGAATGACACTAGACAACCTCAAA 

TCAAGCCTTCCCAATGTCTTTCAGATGTTGACTGCTCTAGCTAATGTCTTTGCCAATGGG 

TTTGAATCCGTTAATGGCCAAACCGGTACAGATGTTTCCGACACATCCCAACATTCCGAT 
GAATCTCAACCCTAA 

>G1079 Amino Acid Sequence (conserved domain in AA coordinates: 1-50)
MGCAASRIDNEEKVLVCRQRKRLMKKLLGFRGEFADAQLAYLRALRNTGVTLRQFTESET

LELENTSYGLSLPLPPSPPPTLPPSPPPPPPFSPDLRNPETSHDLADEEEEGENDGGNDG
SGAAPPPPLPNSWNIWNPFESLELHSHPNGDNVVTQVELKKKQQIQQAEEEDWAETKSQF
EEEDEQQEAGGTCLDLSVHQIEAVSGCNMKKPRRLKFKLGEVMDGNSSMTSCSGKDLEKT

HVTDCRIRRTLEGIIRELDDYFLKASGCEKEIAVIVDINSRDTVDPFRYQETRRKRSSSA
KVFSALSWSWSSKSLQLGKDATTSGTVEPCRPGAHCSTLEKLYTAEKKLYQLVRNKEIAK
VEHERKSALLQKQDGETYDLSKMEKARLSLESLETEIQRLEDSITTTRSCLLNLINDELY
PQLVALTSGLAQMWKTMLKCHQVQIHISQQLNHLPDYPSIDLSSEYKRQAVNELETEVTC
WYNSFCKLVNSQREYVKTLCTWIQLTDRLSNEDNQRSSLPVAARKLCKEWQLEYNLRRKC

NKLERRLEKELISLAEIERRLEGILAMEEEEVSSTSLGSKHPLSIKQAKIEALRKRVDIE

KTKYLNSVEVSKRMTLDNLKSSLPNVFQMLTALANVFANGFESVNGQTGTDVSDTSQHSD

ESQP*

>G1335 (56..667)

TTTTTTTTTAAAAGATTTAGAGAGAAAAGTGAGTTATTAAGAGATTCCAATCAAAATGAG 

CGGAGACAACGGCGGTGGTGAGAGGCGCAAAGGCTCCGTCAAGTGGTTTGATACCCAGAA 

GGGTTTCGGCTTCATCACTCCTGACGACGGTGGCGACGATCTCTTCGTTCACCAGTCCTC 

CATCAGATCTGAGGGTTTCCGTAGCCTCGCTGCCGAAGAAGCCGTAGAGTTCGAGGTTGA 

GATCGACAACAACAACCGTCCCAAGGCCATCGATGTTTCTGGACCCGACGGCGCTCCCGT 

CCAAGGAAACAGCGGTGGTGGTTCATCTGGCGGACGCGGCGGTTTCGGTGGAGGAAGAGG 

AGGTGGACGCGGATCTGGAGGTGGATACGGCGGTGGCGGTGGTGGATACGGAGGAAGAGG 

AGGTGGTGGTCGAGGAGGCAGCGACTGCTACAAGTGTGGTGAGCCCGGTCACATGGCGAG

AGACTGTTCTGAAGGCGGTGGAGGTTACGGAGGAGGCGGCGGTGGCTACGGAGGTGGAGG 

CGGATACGGCGGAGGAGGTGGTGGTTACGGAGGTGGTGGCCGTGGAGGTGGTGGCGGCGG 

GGGAAGCTGCTACAGCTGTGGCGAGTCGGGACATTTCGCCAGGGATTGCACCAGCGGTGG 

ACGTTAAAACCAACGCCGGTTACGCGGTGGAGAAGAGTGAGTTGGTTATCTCACAAGTGA 

TCGGTTCTTTCTCCCGCCGCCTTCTATCTCTCTATTATCCACTTTTTGCTTATTATGATG 

GATCTCTATCTTTGTTAGTTGGTTTTTTCTTGATGGTTTCGGATTAGGACTCXTCTTTTG 

GTTTTGCTACTTATGGTTGGTTTTATTTATGGTACTTGTGATATGGGTGAAATGCTCTAC 

TTGTTGCTCTGTTTCAAGTGTTCATAATATGCGAACAAATATTCTGGGTTTTGTTTCAAA 
AAAAA 

>G1335 Amino Acid Sequence (domain in AA coordinates: 24-43, 131-144, 185-203)

MSGDNGGGERRKGSVKWFDTQKGFGFITPDDGGDDLFVHQSSIRSEGFRSLAAEEAVEFE 

VEIDNNNRPKAIDVSGPDGAPVQGNSGGGSSGGRGGFGGGRGGGRGSGGGYGGGGGGYGG

RGGGGRGGSDCYKCGEPGHMARDCSEGGGGYGGGGGGYGGGGGYGGGGGGYGGGGRGGGG 
GGGSCYSCGESGHFARDCTSGGR* 

>G157 (31..621)
GGGCATAACCCTTATCGGAGATTTGAAGCCATGGGAAGAAGAAAAATCGAGATCAAGCGA

ATCGAGAACAAAAGCAGTCGACAAGTCACTTTCTCCAAACGACGCAATGGTCTCATCGAC 

AAAGCTCGACAACTTTCGATTCTCTGTGAATCCTCCGTCGCTGTTGTCGTCGTATCTGCC 

TCCGGAAAACTCTATGACTCTTCCTCCGGTGACGACATTTCCAAGATCATTGATCGTTAT 

GAAATACAACATGCTGATGAACTTAGAGCCTTAGATCTTGAAGAAAAAATTCAGAATTAT 

CTTCCACACAAGGAGTTACTAGAAACAGTCCAAAGCAAGCTTGAAGAACCAAATGTCGAT 

AATGTAAGTGTAGATTCTCTAATTTCTCTGGAGGAACAACTTGAGACTGCTCTGTCCGTA 

AGTAGAGCTAGGAAGGCAGAACTGATGATGGAGTATATCGAGTCCCTTAAAGAAAAGGAG

AAATTGCTGAGAGAAGAGAACCAGGTTCTGGCTAGCCAGATGGGAAAGAATACGTTGCTG 

GCAACAGATGATGAGAGAGGAATGTTTCCGGGAAGTAGCTCCGGCAACAAAATACCGGAG 

ACTCTCCCGCTGCTCAATTAGCCACCATCATCAACGGCTGAGTTTTCACCTTAAACTCAA 

AGCCTGATTCATAATTAAGAGAATAAATTTGTATATTATAAAAAGCTGTGTAATCTCAAA 

CCTTTTATCTTCCTCTAGTGTGGAATTTAAGGTCAAAAAGAAAACGAGAAAGTATGGATC 

AGTGTTGTACCTCCTTCGGAGACAAGATCAGAGTTTGTGTGTTTGTGTCTGAATGTACGG 

ATTGGATTTTTAAAGTTGTGCTTTCTTTCTTCAAAAAAAAAAA 

>G157 Amino Acid Sequence (domain in AA coordinates: 2-57) 

MGRRKIEIKRIENKSSRQVTFSKRRNGLIDKARQLSILCESSVAVVVVSASGKLYDSSSG 

DDISKIIDRYEIQHADELRALDLEEKIQNYLPHKELLETVQSKLEEPNVDNVSVDSLISL

EEQLETALSVSRARKAELMMEYIESLKEKEKLLREENQVLASQMGKNTLLATDDERGMFP

GSSSGNKIPETLPLLN* 

>G1895 (1..954) 

ATGAATAACCAATCTGTTACTGACAATACAAGTCTTAAGCTGTC^TCTAATCTTAACAAC 

GAGTCAAAAGAAACATCTGAGAACAGTGATGACCAACACAGCGAGATCACAACAATTACA 

TCGGAAGAAGAGAAAACAACTGAACTGAAGAAACCAGACAAGATTCTTCCATGTCCGAGA 

TGCAACAGCGCAGACACCAAATTCTGTTACTACAACAACTACAACGTTAACCAGCCACGT 

CACTTCTGTAGAAAATGCCAGAGGTATTGGACCGCTGGTGGATCCATGAGGATCGTCCCG 

GTTGGCTCAGGCCGTCGCAAGAACAAGGGATGGGTTTCTTCAGACCAGTACCTGCACATC 

ACTTCCGAGGATACTGACAATTACAATAGCTCCTCAACAAAGATTCTAAGCTTCGAGTCT 

TCGGACTCTTTGGTAACTGAGAGGCCTAAGCATCAATCAAACGAAGTGAAGATAAACGCT 

GAACCTGTTTCACAAGAACCCAACAACTTCCAAGGGTTACTTCCTCCCCAAGCATCCCCT 

GTTTCGCCTCCTTGGCCTTACCAATACCCTCCAAACCCTAGTTTCTACCACATGCCCGTC 

TACTGGGGCTGCGCGATACCGGTTTGGTCTACCCTCGACACTTCTACATGTCTTGGGAAA 

AGGACAAGAGACGAAACTTCTCATGAAACTGTTAAAGAGAGTAAAAATGCTTTTGAGAGA 

ACAAGCTTGCTTTTGGAATCTCAGAGCATCAAAAATGAAACAAGTATGGCTACAAATAAC 

CATGTGTGGTATCCAGTACCGATGACCCGCGAGAAGACACAAGAATTCAGCTTTTTCAGT 

AATGGAGCTGAAACAAAGAGCAGO^ACAACAGATTCGTCCCTGAAACGTATCTTAACCTG 

CAAGCAAACCCTGCAGCCATGGCAAGATCTATGAACTTCAGAGAGAGCATATAA 

>G1895 Amino Acid Sequence (domain in AA coordinates: 55-110) 

MNNQSVTDNTSLKLSSNLNNESKETSENSDDQHSEITTITSEEEKTTELKKPDKILPCPR

CNSADTKFCYYNNYNVNQPRHFCRKCQRYWTAGGSMRIVPVGSGRRKNKGWVSSDQYLHI

TSEDTDNYNSSSTKILSFESSDSLVTERPKHQSNEVKINAEPVSQEPNNFQGLLPPQASP

VSPPWPYQYPPNPSFYHMPVYWGCAIPVWSTLDTSTCLGKRTRDETSHETVKESKNAFER
TSLLLESQSIKiraTSMATN^^ 

QANPAAMARSMNFRESI*

>G1900 (1..897)

ATGCTGGAAACTAAAGATCCTGCGATAAAGCTCTTTGGTATGAAAATTCCTTTCCCGACG 
GTTTTAGAGGTTGCTGATGAAGAAGAAGAAAAGAACCAAAACAAGACATTAACTGATCAA 
TCGGAGAAAGACAAAACCCTAAAGAAACCAACCAAGATTCTTCCATGTCCAAGATGCAAC 
AGCATGGAGACTAAGTTCTGTTACTACAACAACTACAACGTAAACCAACCTCGCCATTTT 
TGTAAAGCTTGTCAGAGATATTGGACCTCAGGTGGGACCATGAGAAGTGTTCCAATCGGA 
GCAGGACGGCGCAAGAACAAGAACAACTCACCAACTTCACATTACCACCATGTGACTATC 
TCCGAAACAAATGGTCCGGTCCTTAGTTTCAGCCTCGGAGATGATCAAAAGGTCTCGAGT 
AATAGGTTTGGTAATCAAAAGCTAGTTGCTAGGATAGAGAACAATGACGAGCGCTCTAAT 
AACAACACTTCGAACGGTTTGAATTGTTTTCCGGGAGTTTCGTGGCCGTACACGTGGAAT 
CCTGCGTTTTACCCGGTTTACCCTTATTGGAGCATGCCAGTGTTGTCTTCTCCGGTAAGT 
TCAAGTCCTACTTCTACTCTTGGTAAGCATTCGAGAGACGAAGACGAGACGGTGAAGCAA 
AAACAGAGGAATGGATCTGTATTGGTTCCAAAGACTTTGAGAATTGATGATCCTAATGAA 
GCTGCAAAGAGTTCGATATGGACAACACTTGGGATCAAGAACGAAGTTATGTTCAATGGG
TTTGGTTCGAAGAAAGAGGTTAAGCTCAGTAACAAAGAAGAAACAGAGACCTCACTTGTT 
CTTTGTGCAAACCCTGCTGCGTTATCAAGATCAATCAATTTCCATGAGCAGATGTGA 
>G1900 Amino Acid Sequence (domain in AA coordinates: 54-106)

MLETKDPAIKLFGMKIPFPTVLEVADEEEEKNQNKTLTDQSEKDKTLKKPTKILPCPRCN
SMETKFCYYNNYNVNQPRHFCKACQRYWTSGGTMRSVPIGAGRRKNKNNSPTSHYHHVTI

SETNGPVLSFSLGDDQKVSSNRFGNQKLVARIENNDERSNNNTSNGLNCFPGVSWPYTWN

PAFYPVYPYWSMPVLSSPVSSSPTSTLGKHSRDEDETVKQKQRNGSVLVPKTLRIDDPNE

AAKSSIWTTLGIKNEVMFNGFGSKKEVKLSNKEETETSLVLCANPAALSRSINFHEQM*

>G2007 (1..861) 

ATGGGAAGGCAGCCATGTTGTGACAAGCTCATGGTGAAGAAGGGGCCGTGGACGGCGGAG 

GAAGACAAGAAACTGATAAACTTTATCTTGACCAACGGCCACTGTTGCTGGAGGGCTTTG 

CCGAAGCTGGCCGGTCTCCGTCGCTGTGGGAAGAGCTGCCGTCTACGGTGGACCAATTAT 

CTCCGACCTGACTTGAAGAGAGGTCTTCTCTCCGACGCCGAGGAACAGCTTGTCATCGAC 

CTTCATGCTCTTCTCGGCAACAGATGGTCCAAGATCGCTGCAAGATTACCAGGAAGAACA 

GACAACGAAATAAAAAATCATTGGAATACTCATATCAAGAAGAAGCTCCTTAAGATGGAA 
ATCGATCCTTCGACCC^TCAACCTCT^ 

AAATCTGAAACTTCATCGAAAGCCGACAATGTAAATGATAATAAAATCGTAGAGATCGAT 

GGGACAACGACAAATACAATAGATGATAGCATTATCACTCATCAAAATAGTTCAAATGAT 

GATTATGAATTACTTGGTGATATAATTCATAATTATGGAGATTTATTTAATATTCTATGG 

ACCAACGATGAACCTCCTCTAGTCGATGATGCATCATGGAGCAATCATAACGTTGGTATT 

GGAGGAACAGCTGCAGTTGCAGCCTCAGACAAGAACAACACTGCTGCCGAGGAAGATTTC 

CCGGAAAGATCATTTGAAAAACAGAACGGCGAAAGTTGGATGTTCTTGGATTATTGCCAA 

GAATTTGGTGTTGAAGATTTTGGGTTCGAGTGTTACCATGGTTTTGGTCAAAGCTCCATG 
AAGACGGGTCACAAGGACTAG 

>G2007 Amino Acid Sequence (domain in AA coordinates: TBD) 

MGRQPCCDKLIWKKGPWTAEEDKKLINFILTNGH 

LRPDLKRGLLSDAEEQLVIDLHALLGNRWSKIAAI^^ 

IDPSTHQPLNKVFTDTNIiVD^ 

DYELLGDIIHireGDLFNILWlTJDEPPL^ 

PERSFEKQNGESWMFLDYCQEFGVEDFGFECYHGFGQSSMKTGHKD*
>G214 (238..2064)

TGAGATTTCTCCATTTCCGTAGCTTCTGGTCTCTTTTCTTTGTTTCATTGATCAAAAGCA 
AATCACTTCTTCTTCTTCTTCTTCTCGATTTCTTACTGTTTTCTTATCCAACGAAATCTG 
GAATTAAAAATGGAATCTTTATCGAATCCAAGCTGATTTTGTTTCTTTCATTGAATCATC 
TCTCTAAAGTGGAATTTTGTAAAGAGAAGATCTGAAGTTGTGTAGAGGAGCTTAGTGATG 
GAGACAAATTCGTCTGGAGAAGATCTGGTTATTAAGACTCGGAAGCCATATACGATAACA 
AAGCAACGTGAAAGGTGGACTGAGGAAGAACATAATAGATTCATTGAAGCTTTGAGGCTT 
TATGGTAGAGCATGGCAGAAGATTGAAGAAC^TC^ 

AGAAGTCACGCTCAGAAATTTTTCTCCAAGGTAGAGAAAGAGGCTGAAGCTAAAGGTGTA 

GCTATGGGTCAAGCGCTAGACATAGCTATTCCTCCTCCACGGCCTAAGCGTAAACCAAAC 

AATCCTTATCCTCGAAAGACGGGAAGTGGAACGATCCTTATGTCAAAAACGGGTGTGAAT 

GATGGAAAAGAGTCCCTTGGATCAGAAAAAGTGTCGCATCCTGAGATGGCCAATGAAGAT

CGACAACAATCAAAGCCTGAAGAGAAAACTCTGCAGGAAGACAACTGTTCAGATTGTTTC 

ACTCATCAGTATCTCTCTGCTGCATCCTCCATGAATAAAAGTTGTATAGAGACATCAAAC 

GCAAGCACTTTCCGCGAGTTCTTGCCTTCACGGGAAGAGGGAAGTCAGAATAACAGGGTA 

AGAAAGGAGTCAAACTCAGATTTGAATGCAAAATCTCTGGAAAACGGTAATGAGCAAGGA 

CCTCAGACTTATCCeATGCATATCCCTGTGCTAGTGCCATTGGGGAGCTCAATAACAAGT 

TCTCTATCACATCCTCCTTCAGAGCCAGATAGTCATCCCCACACAGTTGCAGGAGATTAT 

CAGTCGTTTCCTAATCATATAATGTCAACCCTTTTACAAACACCGGCTCTTTATACTGCC 

GCAACTTTCGCCTCATCATTTTGGCCTCCCGATTCTAGTGGTGGCTCACCTGTTCCAGGG 

AACTCACCTCCGAATCTGGCTGCCATGGCCGCAGCCACTGTTGCAGCTGCTAGTGCTTGG 

TGGGCTGCCAATGGATTATTACCTTTATGTGCTCCTCTTAGTTCAGGTGGTTTCACTAGT 

CATCCTCCATCTACTTTTGGACCATCATGTGATGTAGAGTACACAAAAGCAAGCACTTTA 

CAACATGGTTCTGTGCAGAGCCGAGAGCAAGAACACTCCGAGGCATCAAAGGCTCGATCT 

TCACTGGACTCAGAGGATGTTGAAAATAAGAGTAAACCAGTTTGTCATGAGCAGCCTTCT 

GCAACACCTGAGAGTGATGCAAAGGGTTCAGATGGAGCAGGAGACAGAAAACAAGTTGAC 
CGGTCCTCGTGTGGCTCAAACACTCCGTCGAGTAGTGATGATGTTGAGGCGGATGCATCA

GAAAGGCAAGAGGATGGCACCAATGGTGAGGTGAAAGAAACGAATGAAGACACTAATAAA 

CCTCAAACTTCAGAGTCCAATGCACGCCGCAGTAGAATCAGCTCCAATATAACCGATCCA 

TGGAAGTCTGTGTCTGACGAGGGTCGAATTGCCTTCCAAGCTCTCTTCTCCAGAGAGGTA 

TTGCCGCAAAGTTTTACATATCGAGAAGAACACAGAGAGGAAGAACAACAACAACAAGAA 

CAAAGATATCCAATGGCACTTGATCTTAACTTCACAGCTCAGTTAACACCAGTTGATGAT 

CAAGAGGAGAAGAGAAACACAGGATTTCTTGGAATCGGATTAGATGCTTCAAAGCTAATG 

AGTAGAGGAAGAA<^GGTTTTAAACCATACAAAAGATGTTCCATGGAAGCCAAAGAAAGT 

AGAATCCTCAACAACAATCCTATCATTCATGTGGAACAGAAAGATCCCAAACGGATGCGG 

TTGGAAACTCAAGCTTCCACZATGAGACTCTATTTTCATCTGATCTGTTGTrTGTACTCTG 

TTTTTAAGTTTTCAAGACCACTGCTACATTTTCTTTTTCTTTTGAGGCCTTTGTATTTGT 

TTCCTTGTCGATAGTCTTCCTGTAACATTTGACTCTGTATTATTCAACAAATCATAAACT 
GTTTAATCTTTTTTTTTCCA 

>G214 Amino Acid Sequence (domain in AA coordinates: 22-71) 
METNSSGEDLVIKTRKPYTITKQRERWTEEEH^ 

IRSHAQKFFSKVEKEAEAKGVAMGQALDIAIPPPRPKRKPNNPYPRKTGSGTILMSKTGV 
NDGKESLGSEK^SHPEMANEDRQQSKPEEOT^ 

NASTFREFLPSREEGSQljmRWKESNSDIiNAKSIiENGNEQGPQTYPI^IP^ 

SSLSHPPSEPDSHPHTVAGDYQSFPNHIMSTLLQTPALYTAATFASSFWPPDSSGGSPVP 
GNSPPNLAAMAAATVAAASAWWAANGLLPLCAPIjSSG 

LQHGSVQSREQEHSEASKARSSLDSEDVENKSKPVCHEQPSATPESDAKGSDGAGDRKQV

DRSSCGSNTPSSSDDVEADASERQEDGTNGEVKETNEDTNKPQTSESNARRSRISSNITD 

PWKSVSDEGRIAFQALFSREVLPQSFTYREEHREEEQQQQEQRYPMALDLNFTAQLTPVD 

DQEEKROTGFLGIGLDASKLMSRGRTGFKPYra^ 

RLETQAST* 

>G2155 (63..740)

CTCATATATACCAACCAAACCTCTCTCTGCATCTTTATTAACACAAAATTCCAAAAGATT

AAATGTTGTCGAAGCTCCCTACACAGCGACACTTGCACCTCTCTCCCTCCTCTCCCTCCA 

TGGAAACCGTCGGGCGTCCACGTGGC^GACCTCGAGGTTCCAAAAACAAACCTAAAGCTC 

CAATCTTTGTCACCATTGACCCTCCTATGAGTCCTTACATCCTCGAAGTGCCATCCGGAA 

ACGATGTCGTTGAAGCCCTAAACCGTTTCTGCCGCGGTAAAGCCATCGGCTTTTGCGTCC 

TCAGTGGCTCAGGCTCCGTTGCTGATGTCACTTTGCGTCAGCCTTCTCCGGCAGCTCCTG 

GCTCAACCATTACTTTCCACGGAAAGTTCGATCTTCTCTCTGTCTCCGCCACTTTCCTCC 

CTCCTCTACCTCCTACCTCCTTGTCCCCTCCCGTCTCCAATTTCTTCACCGTCTCTCTCG 

CCGGACCTCAGGGGAAAGTCATCGGTGGATTCGTCGCTGGTCCTCTCGTTGCCGCCGGAA 

CTGTTTACTTCGTCGCCACTAGTTTCAAGAACCCTTCCTATCACCGGTTACCTGCTACGG 

AGGAAGAGCAAAGAAACTCGGCGGAAGGGGAAGAGGAGGGACAATCGCCGCCGGTCTCTG 

GAGGTGGTGGAGAGTCGATGTACGTGGGTGGCTCTGATGTCATTTGGGATCCCAACGCCA 

AAGCTCCATCGCCGTACTGACCACAAATCCATCTCGTTCAAACTAGGGTTTCTTCTTCTT 

TAGATCATGAAGAATGAACAAAAAGATTGCATTTTTAGATTCTTTGTAATATCATAATTG 

ACTCACTCTTTAATCTCTCTATCACTTCTTCTTTAGCTTTTTCTGCAGTGTCAAACTTCA 

CATATTTGTAGTTTGATTTGACTATCCCCAAGTTTTGTATTTTATCATACAAATTTTTGC 

CTGTCTCTAATGGTTGTTTTTTCGTTTGTATAATCTTATGCATTGTTTATTGGAGCTCCA 
GAGATTGAATGTATAATATAATGGTTTAAT 

>G2155 Amino Acid Sequence (domain in AA coordinates: 18-38)

MLSKLPTQRHLHLSPSSPSMETVGRPRGRPRGSKNKPKAPIFVTIDPPMSPYILEVPSGN

DVVEALNRFCRGKAIGFCVLSGSGSVADVTLRQPSPAAPGSTITFHGKFDLLSVSATFLP

PLPPTSLSPPVSNFFTVSLAGPQGKVIGGFVAGPLVAAGTVYFVATSFKNPSYHRLPATE 

EEQRNSAEGEEEGQSPPVSGGGGESMYVGGSDVIWDPNAKAPSPY* 

>G234 (106..1035)

CACAACATCATACCCACCAACATATATAATGTTGATCATAGAGAGATAAACAGAGGCCGC 
TATCAAGAACAAGACTAAGAACAAGACTTCACTAGGAGTACAAGTATGGGAAGAGCACCG 
TGTTGTGACAAAGCAAACGTGAAGAAAGGGCCTTGGTCTCCTGAGGAAGATGCAAAACTC 
AAATCTTACATTGAAAATAGTGGCACCGGAGGCAATTGGATCGCTTTGCCTCAAAAGATT 
GGTTTAAAGAGATGTGGAAAGAGTTGCAGGCTGAGGTGGCTTAACTATCTTAGACCAAAC 
ATCAAACATGGTGGCTTCTCTGAGGAAGAAGAAAACATCATTTGTAGCCTTTACCTTACA
ATTGGTAGCAGGTGGTCTATAATCGCTGCTCAATTGCCGGGACGAACAGACAACGATATA 
AAAAACTATTGGAACACGAGGCTCAAGAAGAAACTCATTAACAAACAACGCAAGGAGCTT

CAAGAAGCTTGTATGGAGCAGCAAGAGATGATGGTGATGATGAAGAGACAACACCAACAA 

CAACAAATCCAAACTTCTTTTATGATGAGACAAGACCAAACAATGTTCACATGGCCACTA

CATCATCATAATGTTCAAGTTCCAGCTCTTTTCAGAATCAAACCAACTCGTTTTGCGACC 

AAGAAGATGTTAAGCCAGTGCTCATCAAGAACATGGTCAAGATCGAAGATCAAGAACTGG 

AGAAAACAAACCTCATCATCATCAAGATTCAATGACAACGCTTTTGATCATCTCTCTTTC 

TCTCAACTCTTGTTAGATCCTAATCATAACCACTTAGGATCAGGAGAGGGTTTCTCCATG 

AACTCTATCTTGAGCGCCAACACAAACTCTCCATTGCTTAACACAAGTAATGATAATCAG 

TGGTTCGGGAATTTC(^GGCCGAAACCGTAAACTTGTTCTCAGGAGCCTCCACAAGTACT 

TCGGCAGATCAAAGCACTATAAGTTGGGAAGACATAAGCTCTCTTGTTTATTCTGATTCA 

AAGCAATTTTTTTAATTATAATAATATATTATTCTTAAGATGAAACGTACATCATTATTA 

TTAATTGGGGGTACGTAACGTATATATGGAATAACGATCTAGTTTGTTTAAATTTAAAA 

>G234 Amino Acid Sequence (domain in AA coordinates: 14-115) 

MGRAPCCDKANVKKGPWSPEEDAKLKSYIENSGTGGNWIALPQKIGLKRCGKSCRLRWLN

YLRPNIKHGGFSEEEENIICSLYLTIGSRWSIIAAQLPGRTDNDIKNYWNTRLKKKLINK

QRKELQEACMEQQEMMVMMKRQHQQQQIQTSFMMRQDQTMFTWPLHHHNVQVPALFRIKP
TRFATKKMLSQCSSRTWSRSKIKNWRKQTSSSSRFNDNAFDHLSFSQLLLDPNHNHLGSG

EGFSMNSILSANTNSPLLNTSNDNQWFGNFQAETVNLFSGASTSTSADQSTISWEDISSL
VYSDSKQFF* 

>G361 (54..647)

TCTGTCTCTCTCTCTCTCTTTGTAAATATACATATATAGATAAGCTCACATATATGGCGA 
CTGAAACATCTTCTTTGAAGCTCTTCGGTATAAACCTACTTGAAACGACGTCGGTTCAAA 
ACCAGTCATCGGAACCAAGACCCGGATCCGGATCAGGATCCGAGTCACGTAAGTACGAGT 
GTCAATACTGTTGTAGAGAGTTTGCTAACTCTCAAGCTCTTGGTGGTCACCAAAACGCTC 
ACAAGAAAGAGCGTCAGCTTCTTAAACGTGCACAGATGTTAGCTACTCGTGGTTTGCCAC 
GTCATCATAATTTTCACCCTCATACCAATCCGCTTCTCTCCGCCTTCGCGCCGCTGCCTC 
ACCTCCTCTCTCAGCCGCATCCTCCGCCGCATATGATGCTCTCTCCTTCTTCTTCGAGTT 
CTAAGTGGCTTTACGGTGAACACATGTCGTCACAAAACGCCGTTGGGTACTTTCATGGTG 
GAAGGGGACTTTACGGAGGTGGCATGGAGTCTATGGCCGGAGAAGTAAAGACTCATGGTG 
GTTCTTTGCCGGAGATGAGGAGGTTCGCCGGAGATAGTGATCGGAGTAGCGGAATTAAGT 
TAGAGAATGGTATTGGGCTGGACCTCCATTTAAGCCTTGGGCCATGAATGATTATAATTT 
TGGCCCAGTAAAGATCTGTAAAATACTACTAGGATTTCATTTTTATAGAGTATGTTTTTT 
TCCTTAATTTCGGTTGAAATTGGTGAATATTTTTATCTCTTACTTACCAAATCTCATATT 
TCTATGTATGCGTTTGCTTTCACTTTTTTTTTTTATATAATTCTTCTTGTAAAAAATGCA 

ATGTGAGTTTTCTTCCCTATCATTCTGTCAAGCTTTGGTTCAATTATTTAGTAATCGAAT 
AATATAGGAATAGTGTTGAAAG 

>G361 Amino Acid Sequence (domain in AA coordinates: 43-63) 

MATETSSLKLFGINLLETTSVQNQSSEPRPGSGSGSESRKYECQYCCREFANSQALGGHQ
NAHKKERQLLKRAQMLATRGLPRHHNFHPHTNPLLSAFAPLPHLLSQPHPPPHMMLSPSS

SSSKWLYGEHMSSQNAVGYFHGGRGLYGGGMESMAGEVKTHGGSLPEMRRFAGDSDRSSG 
IKLENGIGLDLHLSLGP*
>G562 (137..1285)

ATTTGAATTTCTGGGTTTCTCTCTGTTTAAGCTTCTTCTTCTTCATCTTCTGCTTACGTT 
TCTTCTTCAAGGAGCTTTCGGATTCTTGTAGAAAGAGTCATTGTTCTCTTGAGTGGGAAA 
CCTTGAAACCATTCCTATGGGAAATAGCAGCGAGGAACCAAAGCCTCCTACCAAATCAGA 
TAAACCATCTTCACCCCCGGTGGATCAAACAAATGTTCATGTCTACCCTGATTGGGCAGC 
TATGCAGGCATATTATGGTCCAAGAGTAGCAATGCCTCCTTATTACAATTCAGCTATGGC 
TGCATCTGGTCATCeTCCTCCTCCTTACATGTGGAATCCTCAGCATATGATGTCACCATC 
TGGAGCACCCTATGCTGCTGTTTATCCTCATGGAGGAGGAGTTTACGCTCATCCCGGTAT 
TCCCATGGGATCACTGCCTCAAGGTCAAAAGGATCCACCTTTAACAACTCCGGGGACGCT 
TTTGAGCATCGACACTCCTACTAAATCTACAGGGAACACAGACAATGGATTGATGAAGAA 
GCTGAAAGAGTTTGATGGGCTTGCTATGTCTCTAGGAAATGGGAATCCTGAAAATGGTGC 
AGATGAACATAAACGATCACGGAACAGCTCAGAAACTGATGGTTCTACTGATGGAAGTGA 
TGGGAATACAACTGGGGCAGATGAACCGAAACTTAAAAGAAGTCGAGAGGGAACTCCZAAC 
AAAAGATGGGAAACAATTGGTTCAAGCTAGCTCATTTCATTCTGTTTCTCCGTCAAGTGG 
TGATACCGGCGTAAAACTCATTCAAGGATCTGGAGCTATACTCTCTCCTGGTGTAAGTGC 
AAATTCCAACCCCTTCATGTCACAATCTTTAGCCATGGTTCCTCCTGAAACTTGGCTTCA 
GAACGAGAGAGAACTGAAACGGGAGCGAAGGAAACAGTCTAATAGAGAATCTGCTAGAAG

GTCAAGATTAAGGAAACAGGCCGAGACAGAAGAACTTGCTAGGAAAGTGGAAGCCTTGAC 

AGCCGAAAACATGGCATTAAGATCTGAACTAAACCAACTTAATGAGAAATCTGATAAACT 

AAGAGGAGCAAATGCAACCTTGTTGGACAAACTGAAATGCTCGGAACCCGAAAAGAGAGT 

CCCCGCAAATATGTTGTCTAGAGTTAAGAACTCAGGAGCTGGAGATAAGAACAAGAACCA 

AGGAGACAATGATTCTAACTCTACAAGCAAATTCCATCAACTGCTCGATACGAAGCCTCG 

AGCTAAAGCAGTAGCTGCAGGCTGAATCGATGGTAATTCATGTCGATTTCTACTTAATTT 

GTCGACATAAACAAAGAAAATAAGTGCTACTAATTTCAGAAAAACTTGATAGATAGATAG 

TATAGTAGAGAGAGAGAGAGAGAGAGAGGTGTGATGATTATTGATCTATAAATTTTCGGA 

GAGAGAGAGGGAGAAAGAGAAACTTTTCCTCCAGATGAAAATTTGGTGTTATGGTTTGTT 

ACTGTTAATATAGAGAGGCTTTTCTTTTTTTATAAAATGGCTTCCTTTGTTGCA 

>G562 Amino Acid Sequence (domain in AA coordinates: 253-315) 

MGNSSEEPKPPTKSDKPSSPPVDQTNVHVYPDWAAMQAYYGPRVAMPPYYNSAMAASGHP

PPPYMWNPQHMMSPSGAPYAAVYPHGGGVYAHPGIPMGSLPQGQKDPPLTTPGTLLSIDT 

PTKSTGNTDNGLMKKLKEFDGLAMSLGNGNPENGADEHKRSRNSSETDGSTDGSDGNTTG

ADEPKLKRSREGTPTKDGKQLVQASSFHSVSPSSGDTGVKLIQGSGAILSPGVSANSNPF 

MSQSLAMVPPETWLQNERELKRERRKQSNRESARRSRLRKQAETEEI^KVEALTAENm 

LRSELNQLNEKSDKIjRGANATLIjDKLKC^ 

NSTSKFHQLLDTKPRAKAVAAG* 

>G591 (88..1020)

GTAAATCTCTCTTTGAAGGTTCCTAACTCGTTAATCGTAACTCACAGTCACTCGTTCGAG 

TCAAAGTCTCTGTCTTTAGCTCAAACCATGGCTAGTAACT^ACCCTCACGACAACCTTTCT 

GACCAAACTCCTTCTGATGATTTCTTCGAGCAAATCCTCGGCCTTCCTAACTTCTCAGCC

TCTTCTGCCGCCGGTTTATCTGGAGTTGACGGAGGATTAGGTGGTGGAGCACCGCCTATG 

ATGCTGCAGTTGGGTTCCGGAGAAGAAGGAAGTCACATGGGTGGCTTAGGAGGAAGTGGA 

CCAACTGGGTTTCACAATCAGATGTTTCCTTTGGGGTTAAGTCTTGATCAAGGGAAAGGA 

CCTGGGTTTCTTAGACCTGAAGGAGGACATGGAAGTGGGAAAAGATTCTCAGATGATGTT 

GTTGATAATCGATGTTCTTCTATGAAACCTGTTTTCCACGGGCAGCCTATGCAACAGCC^ 

CCTCCATCGGCCCCACATCAGCCTACTTCAATCCGTCCCAGGGTTCGAGCTAGGCGTGGT 

CAGGCTACTGATCCACATAGCATCGCTGAGCGGCTACGTAGAGAAAGAATAGCAGAACGG 

ATCAGGGCGCTGCAGGAACTTGTACCTACTGTGAACAAGACCGATAGAGCTGCTATGATC 

GATGAGATTGTCGATTATGTAAAGTTTCTCAGGCTCCAAGTCAAGGTTTTGAGCATGAAC 

CGACTTGGTGGAGCCGGTGCGGTTGCTCCACTTGTTACTGATATGCCTCTTTCATCATCA 

GTTGAGGATGAAACGGGTGAGGGTGGAAGGACTCCGCAACCAGCGTGGGAGAAATGGTCT 

AACGATGGGACTGAACGTCAAGTGGCTAAACTGATGGAAGAGAACGTTGGAGCCGCGATG 

CAGCTTCTTCAATCAAAGGCTCTTTGTATGATGCGAATCTCATTGGCAATGGCAATTTAC 

CATTCTCAACCTCCGGATACATCTTCAGTGGTCAAGCCTGAGAACAATCCTCCACAGTAG 

GATTTCTGCAATAAAGAGTTTGTACAGCTAATCCAACTGTCCAACATGGGTTTTTCTTCT 

GCTCTAATGACTCTGGTTTCTTCTCTCCTCTCTCACCGACTTGAAAGGTAAAAAAGTGAA 

AAAGGCTTTGTAGATGGAATCAATGTAGGATTTGCAGTAGAGGGCAAAAAAATGTCATAT 

AGCTCAATTGATCAAGTCTTAAAAAAAAAAAAAAAAAAAA 

>G591 Amino Acid Sequence (domain in AA coordinates: 143-240)

I^ISTNPHDNLSDQTPSDDFFEQILGLPNFSASSAAGLSGVDGGLGGGAPPMMLQLGSGEE 

GSHMGGLGGSGPTGFHNQMFPLGLSLDQGKGPGFLRPEGGHGSGKRFSDDVVDNRCSSMK

PVFHGQPMQQPPPSAPHQPTSIRPRVRARRGQATDPHSIAERLRRERIAERIRALQELVP

TVNKTDRAAMIDEIVDYVKFLRLQVKVLSMNRLGGAGAVAPLVTDMPLSSSVEDETGEGG

RTPQPAWEKWSNDGTERQVAKLMEENVGAAMQLLQ 

WKPENNPPQ* 

>G8 (247..1596)

AAAAAAAAATATCCGTCTCACTCTCTCGCCGCCGGTAACATTTCCCGGCGACAAAACTTC 
TCTACTCTCACCATTCCTCCATCGTAATCTCTAAATTCTTCTCCATTCTCTTCTTCCTCC 
CGATCATCTCGAGCTCTTCGTGAGAGATTATGTGATTATGTAATCGTTGTTGCTGTAGAA 
GACGATCTCTAACAACTGATTCCTTCATCATCACCTTCGCTAGATTTGTAATTTTCAGAG 
CTTGAGATGTTGGATCTTAACCTCAACGCTGATTCTCCCGAGTCGACTCAGTACGGTGGT
GACTCATACTTAGATCGGCAGACATCAGACAACTCCGCCGGGAATCGAGTGGAAGAGTCC 
GGTACATCGACGTCGTCAGTTATCAATGCCGATGGAGACGAAGACTCTTGCTCTACTCGA 
GCTTTCACTCTCAGTTTCGATATTTTAAAAGTCGGAAGTAGTAGCGGCGGAGACGAAAGC 
CCCGCCGCTTCAGCTTCCGTTACTAAAGAGTTTTTTCCGGTGAGTGGAGACTGTGGACAT
CTACGAGATGTTGAAGGATCATCAAGCTCTAGAAACTGGATAGATCTTTCTTTTGACCGT 
ATTGGTGACGGAGAAACGAAATTGGTAACTCCGGTTCCGACTCCGGCTCCGGTTCCGGCT 
CAGGTTAAAAAGAGTCGGAGAGGACCAAGGTCTAGAAGTTCACAGTATAGAGGAGTTACT 
TTTTATAGAAGAACTGGTCGATGGGAGTCACATATTTGGGATTGTGGGAAACAAGTTTAT 
TTAGGTGGTTTCGACACTGCTCATGCTGCAGCTAGAGCTTATGATCGAGCTGCTATTAAA 
TTTAGAGGTGTTGATGCTGATATCAACTTTACTCTTGGTGATTATGAGGAAGATATGAAA 
CAGGTACAAAACTTGAGTAAGGAAGAGTTTGTGCATATACTGCGTAGACAGAGCACGGGG 
TTTTCGCGGGGGAGTTCGAAGTATCGAGGGGTTACGTTACACAAATGTGGTAGATGGGAA 
GCTAGGATGGGGCAGTTTCTTGGTAAAAAGGCTTATGACAAGGCTGCAATCAACACTAAT 
GGTAGAGAAGO\GTCACGAACTTCGAGATGAGTTCATACCAAAATGAGATTAACTCTGAG 
AGCAATAACTCTGAGATTGACCTCAACTTGGGAATCTCTTTATCGACCGGTAATGCGCCA 
AAGCAAAATGGGAGGCTCTTTCACTTCCCTTCTAATACTTATGAAACTCAGCGTGGAGTT 
AGCTTGAGGATAGATAACGAATACATGGGAAAGCCGGTGAATACACCTCTTCCTTATGGA 
TCCTCGGATCATCGCCTTTACTGGAACGGAGCATGCCCGAGTTATAATAATCCCGCCGAG 
GGAAGAGCAACAGAAAAGAGAAGTGAAGCTGAAGGGATGATGAGTAACTGGGGATGGCAG 
AGACCGGGGCAAACAAGCGCCGTGAGACCGCAGCCACCGGGACCACAACCACCACCATTG 
TTCTCAGTTGCAGC^GCATCATCAGGATTCT(^CATTTCCGGCCA(^^CCTCCCAATGAC 
AATGCAACACGTGGTTACTTTTATCCACACCCTTAACTTGTAAGGGGACATATGAGAGTT 
TTTTTACCATCTCTCTCTCTCTCAACACTCTAGTCCCCTTTCAAAAATGTCATTTGGGTT 
TTAGATTTTTCACATACAATGATCAATTTTTCC 

>G8 Amino Acid Sequence (domain in AA coordinates: 151-217, 243-296) 
MLDLNIiNADSPESTQYGGDSYLDRQTSD^ 

TLSFDILKVGSSSGGDESPAASASVTKEFFPVSGDCGHLRDVEGSSSSRNWIDLSFDRIG

DGETKLVTPVPTPAPVPAQVKKSRRGPRSRSSQYRGVTFYRRTGRWESHIWDCGKQVYLG

GFDTAHAAARAYDRAAIKFRGVDADINFTLGDYEEDMKQVQNLSKEEFVH 

RGSSKYRGVTLHKCGRWEARMGQFLGKKAYDKAAINTNGREAVTNFEMSSYQNEINSESN

NSEIDLNLGISLSTGNAPKQNGRLFHFPSNTYETQRGVSLRIDNEYMGKPVNTPLPYGSS

DHRLYWNGACPSYNNPAEGRATEKRSEAEGMMSNWGWQRPGQTSAVRPQPPGPQPPPLFS 

VAAASSGFSHFRPQPPNDNATRGYFYPHP* 

>G859 (162..752)

GATTTGTCATTXTTTGTCTAGCCAAAAAAAAAAAAAAAAAAGGAGAGAGAGAGAGA'GAGA 
GAGAGAGAGAGAAACGAAGAAAAAAAAAGAAGCAAAAAACATTGTGGGTCTCCGGTGATT 
AGGATCAAATTAGGGCACCAGCCTTATCGGAGGAAGAAGCCATGGGTAGAAAAAAAGTCG 
AGATCAAGCGAATCGAGAACAAAAGTAGTCGACAAGTCACTTTCTCCAAACGACGCAATG 
GTCTCATCGAGAAAGCTCGACAACTTTCAATTCTCTGTGAATCTTCCATCGCTGTTCTCG 
TCGTCTCCGGCTCCGGAAAACTCTACAAGTCTGCCTCCGGTGACAACATGTCAAAGATCA 
TTGATCGTTACGAAATACATCATGCTGATGAACTTGAAGCCTTAGATCTTGCAGAAAAAA 
CTCGGAATTATCTGCC^CTCAAAGAGTTACTAGAAATAGTCCAAAGCAAGCTTGAAGAAT 
CAAATGTCGATAATGCAAGTGTGGATACTTTAATTTCTCTGGAGGAACAGCTCGAGACTG 
CTCTGTCCGTAACTAGAGCTAGGAAGACAGAACTAATGATGGGGGAAGTGAAGTCCCTTC 
AAAAAACGGAGAACTTGCTGAGAGAAGAGAACCAGACTTTGGCTAGCCAGGTGGGGAAGA 
AGACGTTTCTGGTTATAGAAGGTGACAGAGGAATGTCATGGGAAAATGGCTCCGGCAACA 
AAGTACGGGAGACTCTTCCGCTGCTCAAGTAATCACCATCATCAACGGCTGAGCTTTCAC 
CTTAAACTTACAGCCTGATTCAGAAGTTTTTACAAATTTGTAAATTATAAAAAGCTTCAT 
AATAATCTCAACCTTTTTATCTTCCTCGCGCCAATGTGGAAATTAAGGTTAAAAATAAAA 
TAAAACAGAAGCTCATGCGAAAGAATTGTAAAACTAAGATAAAGCTATAGTAGATCTTTA
TTGTACCTTCGTAGACGATATAAGATTTATTCGTGTGTTTGTCTTCCCCTCNAAAAAAAA 
AAAAAAAAAAAAAAAA 

>G859 Amino Acid Sequence (domain in AA coordinates: TBD) 

MGRKKVEIKRIENKSSRQVTFSKRRNGLIEKARQLSILCESSIAVLVVSGSGKLYKSASG

DNMSKIIDRYEIHHADELEALDLAEKTRNYLPLKELLEIVQSKLEESNTO 

EEQLETALSVTRARKTELMMGEVKSLQKTENLLREENQTLASQVGKKTFLVIEGDRGMSW 

ENGSGNKVRETLPLLK* 

>G878 (197..1738)

CAAAAAAAATCTCTCCCATTAAAAGACTGCCCAAAGAAATATTTTATACAAAATGAAAGA 
GAGAAACACGACACGAATTTTGTATAATTAAGATTACACAAAAAAAAGTGTTAGAAAGAG 
AAATATCTTCTTCTTTTTTCTGTGTGAGTTGGGTTTGTTAAAGTTTTATCCTTTTTGTTC

TCAAAATCT^GAATCGATGGCGGAGAAGGAAGAAAAAGAACCATCGAAGTTAAAATCATC 

CACCGGAGTTTC^CGGCC?U^CGATTTCACTACCTCCTCGACCGTTTGGTGAAATGTTTTT 

TAGCGGTGGCGTTGGATTTAGTCCTGGACCAATGACTCTCGTCTCAAATTTATTCTCTGA 

TCCTGATGAGTTCAAGTCTTTCTCTCAGCTTTTAGCTGGAGCTATGGCTTCTCCGGCGGC 

AGCTGCTGTTGCCGCCGCTGCTGTGGTTGCTACTGCTCATCATCAGACACCTGTGAGCTC 

TGTCGGTGATGGCGGTGGAAGCGGTGGTGATGTTGACCCGAGGTTTAAGCAGAGTAGACC 

AACGGGATTGATGATAACTCAACCACCGGGGATGTTTACTGTACCGCCGGGGTTAAGTCC 

GGCTACTCTTTTGGATTCTCCGAGCTTCT^ 

TGGTATGACACATCAACAAGCTTTAGCACAAGTCACT 

TGTTCATATGCAGCAATCACAACAATCTGAA 

ACAACAACAACAAGCTTCATTGACTGAGATTCC^TCATTTTCTTCTGCACCTAGGTCTCA 
GATTCGAGCCTCGGTTCAAGAAACATCGCAGGGTCAGAGAGAGACTTCGGAAATATCTGT 
CTTTGAGCATCGGTCACAGCCTCAAAATGCTGACAAACCAGCTGATGATGGATACAACTG 
GCGGAAATATGGGCAGAAGCAAGTGAAGGGGAGCGATTTTCCTCGGAGTTATTACAAATG 
TACGCATCCAGCTTGTCCTGTCAAGAAGAAAGTGGAGAGGTCACTCGATGGACAAGTAAC 
GGAAAT<^TCTACJ^AGGGTCAACACAATCATGAGCTTCCTCAAAAGCGCGGTAACAATAA 
CGGGAGTTGTAAAAGTTCTGATATTGCAAATCAGTTTCAAACAAGTAATAGCAGTCTCAA 
CAAGAGTAAGAGGGACCAGGAAACAAGCCAAGTTACAACAACAGAGCAGATGTCTGAAGC 
AAGTGATAGCGAGGAGGTTGGGAATGCAGAGACTAGTGTGGGAGAAAGACATGAGGATGA 
GCCTGATCCCAAGCGAAGAAATACAGAAGTTCGGGTTTCAGAACCAGTTGCTTCATCGCA 
TAGAACTGTGACAGAGCCTAGGATTATTGTCCAAACGACGAGTGAAGTTGACCTCTTAGA 
TGATGGATATAGGTGGCGCAAGTATGGTCAGAAAGTAGTCAAAGGAAATCCTTATCCGAG 
GAGCTACTATAAGTGTACAACACCAGATTGCGGAGTAAGGAAACATGTAGAGAGAGCAGC 
AACTGACCCAAAAGCTGTTGTAACAACATATGAAGGTAAACATAACCATGATGTTCCAGC 
TGCTAGAACCAGCAGCCATCAGTTAAGACCAAACAATCAACACAACACCTCAACGGTTAA 
CTTC^TCATCAACAGCCTGTTGCACGTTTAAGGCTTAAAGAAGAGCAAATCACTTGACA 
GAGAAGAAGAATACGACGGCGCTTGAGCTTTTGTGAGTTTAATGAATCTTCTTTTTGGTT 
AATGAACCTGTTTTTGTTGCCTCAAAACACCAG 

TTACAGTTTCAAAAGGTATGTTCTTTTATTTCATGTTGGAATCTTCTGTGTAATCTTA^ 

AAGCTTTAGGAGGTAATGTAAAAAACCAGATTCAAAGTTATGCCCTTATGTGAATTCTTT 

TGTACATGGGATAAAGAA^TTTACAGGTATCCTTTTT 

AAAA 

>G878 Amino Acid Sequence (domain in AA coordinates: 250-305, 415-475)

MAEKEEKEPSKLKSSTGVSRPTISLPPRPFGEMFFSGGVGFSPGPMTLVSNLFSDPDEFK

SFSQLLAGAMASPAAAAVAAAAVVATAHHQTPVSSVGDGGGSGGDVDPRFKQSRPTGLMI

TQPPGMFTVPPGLSPATLLDSPSFFGLFSPLQGTFGMTHQQALAQVTAQAVQGNNVHMQQ
SQQSEYPSSTQQQQQQQQQASLTEIPSFSSAPRSQIRASVQETSQGQRETSEISVFEHRS
QPQNADKPADDGYNWRKYGQKQVKGSDFPRSYYKCTHPACPVKKKVERSLDGQVTEIIYK
GQHNHELPQKRGNNNGSCKSSDIANQFQTSNSSLNKSKRDQETSQVTTTEQMSEASDSEE

VGNAETSVGERHEDEPDPKRRNTEVRVSEPVASSHRTVTEPRIIVQTTSEVDLLDDGYRW
RKYGQKVVKGNPYPRSYYKCTTPDCGVRKHVERAATDPKAVVTTYEGKHNHDVPAARTSS
HQLRPNNQHNTSTVNFNHQQPVARLRLKEEQIT* 
>G971 (131..1171)

TTTTTTTTCTTCCCTCTTTTAGAACTCTCTCTCTCTCTCGTTTTTGACACTTATCCTCTC 
TCTTTTTTCTCTCTCCCTCTCTCTCTGGCCGGAAAAAAGAACAACGTCGTTTATAGCTAA 
AGATTCGATCATGTTGGATCTTAACCTAAAGATCTTTTCTTCTTATAACGAAGATCAAGA 
TCGGAAAGTACCATTAATGATCTCAACCACCGGTGAAGAAGAATCTAACTCATCTTCCTC 
CTCCACAACAGACTCTGCAGCGAGAGATGCTTTCATCGCTTTTGGAATTCTCAAACGCGA 
CGATGACCTTGTTCCTCCTCCTCCTCCTCCTCCTCATAAAGAAACAGGAGATCTCTTTCC 
GGTGGTGGCTGATGCTCGTCGGAATATAGAATTCTCCGTGGAAGACAGTCACTGGTTGAA 
TCTTTCTTCTTTACAAAGAAATACACAGAAAATGGTGAAGAAGAGCAGAAGAGGACCAAG 
GTCTCGTAGCTCCCAATATCGTGGCGTCACTTTTTACCGTCGCACCGGTCGTTGGGAATC 
TCATATTTGGGATTGTGGAAAGCAAGTTTATTTGGGCGGGTTTGATACTGCTTACGCAGC 
AGCAAGGGCTTACGACCGAGCTGCTATCAAATTCCGTGGTCTCGATGCAGACATCAATTT 
CGTCGTGGATGATTATAGGCATGACATCGATAAGATGAAGAATTTAAATAAGGTGGAGTT 
CGTGCAAACACTTAGGCGAGAGAGTGCGAGTTTCGGAAGAGGAAGTTCCAAATACAAAGG 
CTTGGCTCTTCAAAAATGGACCCAATTCAAAACTCATGATCA

CAGGGGATGGGATGCAGCAGCAATAAAATACAATGAGTTGGGAAAGGGAGAAGGAGCCAT 
GAAGTTTGGTGCCCATATCAAAGGAAATGGTCACAATGATCTTGAACTAAGTCTCGGAAT 
TTCATCATCATCGGAAAGTATAAAGTTGACAACAGGCGATTACTATAAGGGTATCAATCG
GTCCACGATGGGTTTATACGGTAAGCAATCATCGATATTTTTACCCATGGCAACCATGAA 
ACCTCTGAAGACAGTTGCAGCATCATCAGGATTCCCTTTTATCAGCATGACAAGTTCCTC 
TTCCTCCATGTCCAATTGTTTTGATCCATAGGATCGTTCTACACTCTCTTAACTAATATA
TATTTTTACTCTATCTGATTATTGTATACAAGGATAAAATTTGATTCTTTCCTTAATGAG 
TGAGAAATATTGGAAGTGTTAAAAAAAAAAAAAAAAAAAAAAA 

>G971 Amino Acid Sequence (conserved domain in AA coordinates: 120-186)
MLDLNLKIFSSYNEDQDRKVPLMISTTGEE 

VPPPPPPPHKETGDLFPWADARRNIEFSVEDSHT^bNIiSSLQRNTQKW 

SQYRGVTFYRRTGRWESHIWDCGKQVYLGGFDTAYAAARAYDRAAIKFRGLDADINFVVD

DYRHDIDKMKNLNKVEFVQTLRRESASFGRGS 

DAAAIKYNELGKGEGAMKFGAHIKGNGHNDLELSLGISSSSESIKLTTGDYYKGINRSTM

GLYGKQSSIFLPMATMKPLKTVAASSGFPFISMTSSSSSMSNCFDP* 

>G975 (58..657)

ATTACTCATCATCAAGTTCCTACTTTCTCTCTGACAAACATCACAGAGTAAGTAAGAATG 

GTACAGACGAAGAAGTTCAGAGGTGTCAGGCAACGCCATTGGGGTTCTTGGGTCGCTGAG 

ATTCGTCATCCTCTCTTGAAACGGAGGATTTGGCTAGGGACGTTCGAGACCGCAGAGGAG 

GCAGCAAGAGC^TACGACGAGGCCGCCGTTTTAATGAGCGGCCGC^^CGCCAAAACCAAC 

TTTCCCCTCAACAACAACAACACCGGAGAAACTTCCGAGGGCAAAACCGATATTTCAGC^ 

TCGTCCACAATGTCATCCTCAACATCATCTTCATCGCTCTCTTCCATCCTCAGCGCCAAA 

CTGAGGAAATGCTGCAAGTCTCCTTCCCCATCCCTCACCTGCCTCCGTCTTGACACAGCC 

AGCTCCCATATCGGCGTCTGGCAGAAACGGGCCGGTTCAAAGTCTGACTCCAGCTGGGTC 

ATGACGGTGGAGCTAGGTCCCGCAAGCTCCTCCCAAGAGACTACTAGTAAAGCTTCACAA 

GACGCTATTCTTGCTCCGACCACTGAAGTTGAAATTGGTGGCAGCAGAGAAGAAGTATTG 

GATGAGGAAGAAAAGGTTGCTTTGCAAATGATAGAGGAGCTTCTCAATACAAACTAAATC 

TTATTTGCTTATATATATGTACCTATTTTCATTGCTGATTTACAGCCAAAATAATCAATT 

ATACCGTGTATTTTATAGATGTTTTATATTAAAAGGTTGTTAGATATA 

>G975 Amino Acid Sequence (domain in AA coordinates: 4-71) 

MVQTKKFRGVRQRHWGSWVAEIRHPIiLK^ 

NFPLirarcnrc^ 

ASSHIGVWQKRAGSKSDSSWVMTVELGPASSSQETTSKASQDAILAPTTEVEIGGSREEV

LDEEEKVALQMIEELLNTN* 

>G994 (180..917)

TGTATATATAGTTAGTTAGTTGAGATAAACTTGGTTACCACTTTTGTGTGGTCTTTCTTT 
TTCTTTTTCTCCATTTTCCATTTATCGACCCCTTGGGTGTAGCTAATTACTTTCGCGATT 
TTCAAATCCAATAAAGTTTTAATTTGATGAAGCTTTTTTTAAACCATATAATATAAATAA 
TGGGTGGTCGTAAACCATGTTGTGATGAGGTTGGATTAAGAAAGGGTCCATGGACAGTGG 
AAGAAGATGGGAAACTAGTTGATTTCTTAAGGGCACGTGGCAACTGCGGTGGTGGTGGAG
GAGGATGGTGCTGGAGAGACGTGCCAAAACTGGCGGGGCTAAGGAGGTGTGGCAAAAGTT 
GCCGTCTCCGGTGGACTAATTATCTCCGGCCAGATCTCAAGAGAGGTCTTTTTACTGAAG 
AAGAAATCCAACTAGTCATTGATCTTCATGCTCGCCTTGGCAATAGATGGTCGAAGATTG 
CAGTGGAGTTACCAGGAAGAACAGACAACGATATCAAAAATTATTGGAACACTCATATAA 
AGAGGAAGCTTATAAGAATGGGTATTGATCCAAACACACATCGTCGATTTGACCAACAAA 
AAGTCAACGAGGAGGAAACGATATTGGTCAACGATCCAAAGCCTCTGTCTGAGACCGAGG 
TATCTGTTGCTTTGAAGAATGACACGTCAGCAGTGTTATCAGGAAATCTAAACCAATTGG 
CTGACGTGGACGGTGATGATCAGCCGTGGAGCTTTCTAATGGAAAATGACGAAGGAGGAG 
GTGGCGACGCCGCCGGAGAGCTTACGATGCTATTGTCCGGTGACATTACGTCATCATGTT 
CTTCTTCGTCATCTTTGTGGATGAAGTATGGAGAATTCGGATACGAAGATTTAGAACTTG 
GATGTTTCGATGTTTAGAGATTCAAGTATGTTTAATTAGGCCGTAGGTTGATTAATCATA 
AGGTTCATTGACTTCATTCTAGAATTGTGTAGTTGGACCAGTATAAAGAATCAAAGTTAT 
GAAACATTGTAATTTGATTTCCAAATTAATCTAATGAATAAATGTGCTTTGCAAAAAAAA 
AAAAAAAAAAAAAAA 

>G994 Amino Acid Sequence (domain in AA coordinates: 14-123) 
MGGRKPCCDEVGLRKGPWTVEEDGKLVDFLRARGNCGGGGGGWCWRDVPKLAGLRRCGKS






CRLRWTNYLRPDLKRGLFTEEEIQLVIDLHARLGNRWSKIAVELPGRTDNDIKNYWNTHI

KRKLIRMGIDPNTHRRFDQQKVNEEETILVNDPKPLSETEVSVALKNDTSAVLSGNLNQL
ADVDGDDQPWSFLMENDEGGGGDAAGELTMLLSGDITSSCSSSSSLWMKYGEFGYEDLEL 
GCFDV* 

>G2347 (81..626)

AGCCCATCCTTCAACATTGCTTCCTAACCAGAAATCCACCATCATCTTCCCACGAATACA 
ACTTAAAGCTTTACCAGAAAATGGAGGGTCAGAGAACACAACGCCGGGGTTACTTGAAAG 
ACAAGGCTACAGTCTCCAACCTTGTTGAAGAAGAAATGGAGAATGGCATGGATGGAGAAG 
AGGAGGATGGAGGAGACGAAGACAAAAGGAAGAAGGTGATGGAAAGAGTTAGAGGTCCTA 
GCACTGACCGTGTTCCATCGCGACTGTGCCAGGTCGATAGGTGCACTGTTAATTTGACTG 
AGGCCAAGCAGTATTACCGCAGACACAGAGTATGTGAAGTACATGCAAAGGCATCTGCTG 
CGACTGTTGCAGGGGTCAGGCAACGCTTTTGTCAACAATGCAGCAGGTTTCATGAGCTAC 

CAGAGTTTGATGAAGCTAAAAGAAGCTGCAGGAGGCGCTTAGCTGGACACAATGAGAGGA
GGAGGAAGATCTCTGGTGACAGTTTTGGAGAAGGGTCAGGCCGGAGAGGGTTTAGCGGTC
AACTGATCCAGACTCAAGAAAGAAACAGGGTAGACAGGAAACTTCCTATGACCAACTCAT
CATTCAAGCGACCACAGATCAGATAAACCCTCCCGCTCTCTCTCTTCTGTCATCTACATA
TGCTCTATCTACACTCTTATTAGACAAATAATGGCATCTAACAATGTGAAGAAAAGTTGG
TCATGGTATTAAATCCTACACGGATATATAACTATAAACCTCTAGTCCCCTCTATGCTGT 

CCTGTAATGAATATCTATCCGGAAATGTATTCGCATAGTCTTGCGTCTAATAATGTTTAT
TGATTTTGTA 

>G2347 Amino Acid Sequence (domain in AA coordinates: 60-136)
MEGQRTQRRGYLKDKATVSNLVEEEMENGMDGEEEDGGDEDKRKKVMERVRGPSTDRVPS
RLCQVDRCTVNLTEAKQYYRRHRVCEVHAKASAATVAGVRQRFCQQCSRFHELPEFDEAK
RSCRRRLAGHNERRRKISGDSFGEGSGRRGFSGQLIQTQERNRVDRKLPMTNSSFKRPQI
R* 

>G2010 (1..525) 

ATGGAGGGTAAGAGATCACAAGGACAAGGXTACATGAAAAAGAAGTCTTACCTTGTGGAA
GAAGATATGGAGACTGATACGGATGAAGAAGAGGAAGTAGGTAGGGATAGAGTTAGAGGG
TCTAGAGGTAGCATCAATCGTGGTGGCTCGTTGCGGCTTTGCCAAGTAGATAGATGCACA
GCTGATATGAAAGAGGCAAAACTGTATCACCGGAGACACAAAGTGTGTGAAGTTCATGCA
AAGGCATCTTCTGTCTTTCTCTCAGGACTTA^^ 

TTTCATGACCTCCAAGAGTTTGATGAAGCTAAGAGAAGTTGCAGGAGGCGCTTAGCTGGA 
CACAATGAGCGAAGAAGGAAGAGCTCTGGTGAGAGTACTTATGGAGAAGGATCAGGTCGG 
AGAGGAATCAATGGTCAGGTGGTGATGCAGAATCAAGAAAGATCAAGGGTAGAGATGACA 
CTTCCTATGCCAAAGTCATCATTCAAGCGACCACAGATTAGATAG 

>G2010 Amino Acid Sequence (domain in AA coordinates: 53-127) 
MEGKRSQGQGYMKKKSYLVEEDMETDTDEEEEVGRDRVRGSRGSINRGGSLRLCQVDRCT
ADMKEAKLYHRRHK^CEVHAKASSVFL^ 

HNERRRKSSGESTYGEGSGRRGINGQVVMQNQERSRVEMTLPMPNSSFKRPQIR* 



